NUCLEOTIDE AND PROTEIN SEQUENCES OF
VERTEBRATE SERRATE GENES AND METHODS BASED THEREON

This invention was made in part with government
support under Grant numbers GM 29093 and NS 26084 awarded by
the Department of Health and Human Services. The government
has certain rights in the invention.

1. INTRODUCTION 
The present invention relates to vertebrate Serrate 
genes and their encoded protein products, as well as 
derivatives and analogs thereof. Production of vertebrate 
Serrate proteins, derivatives, and antibodies is also 
provided. The invention further relates to therapeutic 
compositions and methods of diagnosis and therapy. 

2. BACKGROUND OF THE INVENTION
Genetic analyses in Drosophila have been extremely
useful in dissecting the complexity of developmental pathways
and identifying interacting loci. However, understanding the
precise nature of the processes that underlie genetic
interactions requires a knowledge of the protein products of
the genes in question.

Embryological, genetic and molecular evidence
indicates that the early steps of ectodermal differentiation
in Drosophila depend on cell interactions (Doe and Goodman,
1985, Dev. Biol. 111:206-219; Technau and Campos-Ortega,
1986, Wilhelm Roux's Arch. Dev. Biol. 195:445-454; Vassin et
al., 1985, J. Neurogenet. 2:291-308; de la Concha et al.,
1988, Genetics 118:499-508; Xu et al., 1990, Genes Dev.
4:464-475; Artavanis-Tsakonas, 1988, Trends Genet. 4:95-100).
Mutational analyses reveal a small group of zygotically-acting
genes, the so-called neurogenic loci, which affect the
choice of ectodermal cells between epidermal and neural
pathways (Poulson, 1937, Proc. Natl. Acad. Sci. 23:133-137;
Lehmann et al., 1983, Wilhelm Roux's Arch. Dev. Biol.
192:62-74; Jürgens et al., 1984, Wilhelm Roux's Arch. Dev.
Biol. 193:283-295; Wieschaus et al., 1984, Wilhelm Roux's
Arch. Dev. Biol. 193:296-307; Nüsslein-Volhard et al., 1984,
Wilhelm Roux's Arch. Dev. Biol. 193:267-282). Null mutations
in any one of the zygotic neurogenic loci, namely Notch (N),
Delta (Dl), mastermind (mam), Enhancer of split (E(spl)),
neuralized (neu), and big brain (bib), result in hypertrophy
of the nervous system at the expense of ventral and lateral
epidermal structures. This effect is due to the misrouting
of epidermal precursor cells into a neuronal pathway, and
implies that neurogenic gene function is necessary to divert
cells within the neurogenic region from a neuronal fate to an
epithelial fate. Serrate has been identified as a genetic
unit capable of interacting with the Notch locus (Xu et al.,
1990, Genes Dev. 4:464-475). These genetic and developmental
observations have led to the hypothesis that the protein
products of the neurogenic loci function as components of a
cellular interaction mechanism necessary for proper epidermal
development (Artavanis-Tsakonas, S., 1988, Trends Genet.
4:95-100).

Mutational analyses also reveal that the action of
the neurogenic genes is pleiotropic and is not limited solely
to embryogenesis. For example, ommatidial, bristle and wing
formation, which are known also to depend upon cell
interactions, are affected by neurogenic mutations (Morgan et
al., 1925, Bibliogr. Genet. 2:1-226; Welshons, 1956, Dros.
Inf. Serv. 30:157-158; Preiss et al., 1988, EMBO J.
7:3917-3927; Shellenbarger and Mohler, 1978, Dev. Biol.
62:432-446; Technau and Campos-Ortega, 1986, Wilhelm Roux's
Arch. Dev. Biol. 195:445-454; Tomlinson and Ready, 1987, Dev.
Biol. 120:366-376; Cagan and Ready, 1989, Genes Dev.
3:1099-1112).

Sequence analyses (Wharton et al., 1985, Cell
43:567-581; Kidd and Young, 1986, Mol. Cell. Biol.
6:3094-3108; Vassin et al., 1987, EMBO J. 6:3431-3440;
Kopczynski et al., 1988, Genes Dev. 2:1723-1735) have shown
that two of the neurogenic loci, Notch and Delta, appear to
encode transmembrane proteins that span the membrane a single
time. The Notch gene encodes a ~300 kd protein (we use
"Notch" to denote this protein) with a large N-terminal
extracellular domain that includes 36 epidermal growth factor
(EGF)-like tandem repeats followed by three other
cysteine-rich repeats, designated Notch/lin-12 repeats
(Wharton et al., 1985, Cell 43:567-581; Kidd and Young, 1986,
Mol. Cell. Biol. 6:3094-3108; Yochem et al., 1988, Nature
335:547-550). Delta encodes a ~100 kd protein (we use
"Delta" to denote DLZM, the protein product of the
predominant zygotic and maternal transcripts; Kopczynski et
al., 1988, Genes Dev. 2:1723-1735) that has nine EGF-like
repeats within its extracellular domain (Vassin et al., 1987,
EMBO J. 6:3431-3440; Kopczynski et al., 1988, Genes Dev.
2:1723-1735). Molecular studies have led to the suggestion
that Notch and Delta constitute biochemically interacting
elements of a cell communication mechanism involved in early
developmental decisions (Fehon et al., 1990, Cell 61:523-534).

The EGF-like motif has been found in a variety of
proteins, including those involved in the blood clotting
cascade (Furie and Furie, 1988, Cell 53:505-518). In
particular, this motif has been found in extracellular
proteins such as the blood clotting factors IX and X (Rees et
al., 1988, EMBO J. 7:2053-2061; Furie and Furie, 1988, Cell
53:505-518), in other Drosophila genes (Knust et al., 1987,
EMBO J. 761-766; Rothberg et al., 1988, Cell 55:1047-1059),
and in some cell-surface receptor proteins, such as
thrombomodulin (Suzuki et al., 1987, EMBO J. 6:1891-1897) and
LDL receptor (Sudhof et al., 1985, Science 228:815-822). A
protein binding site has been mapped to the EGF repeat domain
in thrombomodulin and urokinase (Kurosawa et al., 1988, J.
Biol. Chem. 263:5993-5996; Appella et al., 1987, J. Biol.
Chem. 262:4437-4440). The Drosophila Serrate gene has been
cloned and characterized (PCT Publication WO 93/12141 dated
June 24, 1993). However, prior to the present invention,
despite attempts to achieve the same, no vertebrate Serrate
gene was available.

Citation of references hereinabove shall not be
construed as an admission that such references are prior art
to the present invention.



3. SUMMARY OF THE INVENTION

The present invention relates to nucleotide
sequences of vertebrate Serrate genes (human Serrate and
related genes of other species), and amino acid sequences of
their encoded proteins, as well as derivatives (e.g.,
fragments) and analogs thereof. Nucleic acids hybridizable
to or complementary to the foregoing nucleotide sequences are
also provided. In a specific embodiment, the Serrate protein
is a human protein.

The invention relates to vertebrate Serrate
derivatives and analogs of the invention which are
functionally active, i.e., they are capable of displaying one
or more known functional activities associated with a
full-length (wild-type) Serrate protein. Such functional
activities include but are not limited to antigenicity
[ability to bind (or compete with Serrate for binding) to an
anti-Serrate antibody], immunogenicity (ability to generate
antibody which binds to Serrate), ability to bind (or compete
with Serrate for binding) to Notch or other toporythmic
proteins or fragments thereof ("adhesiveness"), and ability
to bind (or compete with Serrate for binding) to a receptor
for Serrate. "Toporythmic proteins," as used herein, refers
to the protein products of Notch, Delta, Serrate, Enhancer of
split, and Deltex, as well as other members of this
interacting gene family which may be identified, e.g., by
virtue of the ability of their gene sequences to hybridize,
or their homology to Delta, Serrate, or Notch, or the ability
of their genes to display phenotypic interactions.

The invention further relates to fragments (and
derivatives and analogs thereof) of vertebrate Serrate which
comprise one or more domains of the Serrate protein,
including but not limited to the intracellular domain,
extracellular domain, transmembrane domain,
membrane-associated region, or one or more EGF-like
(homologous) repeats of a Serrate protein, or any combination
of the foregoing.

Antibodies to vertebrate Serrate, its derivatives
and analogs, are additionally provided.

Methods of production of the vertebrate Serrate 
proteins, derivatives and analogs, e.g., by recombinant 
means, are also provided. 

The present invention also relates to therapeutic
and diagnostic methods and compositions based on vertebrate
Serrate proteins and nucleic acids. The invention provides
for treatment of disorders of cell fate or differentiation by
administration of a therapeutic compound of the invention.
Such therapeutic compounds (termed herein "Therapeutics")
include: vertebrate Serrate proteins and analogs and
derivatives (including fragments) thereof; antibodies
thereto; nucleic acids encoding the vertebrate Serrate
proteins, analogs, or derivatives; and vertebrate Serrate
antisense nucleic acids. In a preferred embodiment, a
Therapeutic of the invention is administered to treat a
cancerous condition, or to prevent progression from a
pre-neoplastic or non-malignant state into a neoplastic or a
malignant state. In other specific embodiments, a
Therapeutic of the invention is administered to treat a
nervous system disorder or to promote tissue regeneration and
repair.

In one embodiment, Therapeutics which antagonize,
or inhibit, Notch and/or Serrate function (hereinafter
"Antagonist Therapeutics") are administered for therapeutic
effect. In another embodiment, Therapeutics which promote
Notch and/or Serrate function (hereinafter "Agonist
Therapeutics") are administered for therapeutic effect.

Disorders of cell fate, in particular
hyperproliferative (e.g., cancer) or hypoproliferative
disorders, involving aberrant or undesirable levels of
expression or activity or localization of Notch and/or
Serrate protein can be diagnosed by detecting such levels, as
described more fully infra.

In a preferred aspect, a Therapeutic of the
invention is a protein consisting of at least a fragment
(termed herein "adhesive fragment") of a vertebrate Serrate
which mediates binding to a Notch protein or a fragment
thereof.

3.1. DEFINITIONS 

As used herein, underscoring or italicizing the
name of a gene shall indicate the gene, in contrast to its
encoded protein product which is indicated by the name of the
gene in the absence of any underscoring. For example,
"Serrate" shall mean the Serrate gene, whereas "Serrate"
shall indicate the protein product of the Serrate gene.

4. DESCRIPTION OF THE FIGURES
Figure 1. Nucleotide sequence (SEQ ID NO:1) and
protein sequence (SEQ ID NO:2) of Human Serrate-1 (also known
as Human Jagged-1 (HJ1)).

Figure 2. "Complete" nucleotide sequence
(SEQ ID NO:3) and amino acid sequence (SEQ ID NO:4) of Human
Serrate-2 (also known as Human Jagged-2 (HJ2)) generated on
the computer by combining the sequence of clones pBSlS and
pBS3-2 isolated from human fetal brain cDNA libraries. There
is a deletion of approximately 120 nucleotides in the region
of this sequence which encodes the portion of Human Serrate-2
between the signal sequence and the beginning of the DSL
domain.

Figure 3. Nucleotide sequence (SEQ ID NO:5) of
chick Serrate (C-Serrate) cDNA.

Figure 4. Amino acid sequence (SEQ ID NO:6) of
C-Serrate (lacking the amino-terminus of the signal
sequence). The putative cleavage site following the signal
sequence (marking the predicted amino-terminus of the mature
protein) is marked with an arrowhead; the DSL domain is
indicated by asterisks; the EGF-like repeats (ELRs) are
underlined with dashed lines; the cysteine-rich region
between the ELRs and the transmembrane domain is marked
between arrows; and the single transmembrane domain (between
amino acids 1042 and 1066) is shown in bold.

Figure 5. Alignment of the amino terminal
sequences of Drosophila melanogaster Delta (SEQ ID NO:7) and
Serrate (SEQ ID NO:8) with C-Serrate (SEQ ID NO:6). The
region shown extends from the end of the signal sequence to
the end of the DSL domain. The DSL domain is indicated.
Identical amino acids in all three proteins are boxed.

Figure 6. Diagram showing the domain structures of
Drosophila Delta and Drosophila Serrate compared with
C-Serrate. The second cysteine-rich region just downstream
of the EGF repeats, present only in C-Serrate and Drosophila
Serrate, is not shown. Hydrophobic regions are shown in
black; DSL domains are checkered and EGF-like repeats are
hatched.

5. DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to nucleotide
sequences of vertebrate Serrate genes, and amino acid
sequences of their encoded proteins. The invention further
relates to fragments and other derivatives, and analogs, of
vertebrate Serrate proteins. Nucleic acids encoding such
fragments or derivatives are also within the scope of the
invention. The invention provides vertebrate Serrate genes
and their encoded proteins of many different species. The
Serrate genes of the invention include human Serrate and
related genes (homologs) in vertebrate species. In specific
embodiments, the Serrate genes and proteins are from mammals.
In a preferred embodiment of the invention, the Serrate
protein is a human protein. In most preferred embodiments,
the Serrate protein is Human Serrate-1 or Human Serrate-2.
Production of the foregoing proteins and derivatives, e.g.,
by recombinant methods, is provided.

The invention relates to vertebrate Serrate
derivatives and analogs of the invention which are
functionally active, i.e., they are capable of displaying one
or more known functional activities associated with a
full-length (wild-type) Serrate protein. Such functional
activities include but are not limited to antigenicity
[ability to bind (or compete with Serrate for binding) to an
anti-Serrate antibody], immunogenicity (ability to generate
antibody which binds to Serrate), ability to bind (or compete
with Serrate for binding) to Notch or other toporythmic
proteins or fragments thereof ("adhesiveness"), and ability
to bind (or compete with Serrate for binding) to a receptor
for Serrate. "Toporythmic proteins," as used herein, refers
to the protein products of Notch, Delta, Serrate, Enhancer of
split, and Deltex, as well as other members of this
interacting gene family which may be identified, e.g., by
virtue of the ability of their gene sequences to hybridize,
or their homology to Delta, Serrate, or Notch, or the ability
of their genes to display phenotypic interactions.

The invention further relates to fragments (and
derivatives and analogs thereof) of a vertebrate Serrate
which comprise one or more domains of the Serrate protein,
including but not limited to the intracellular domain,
extracellular domain, transmembrane domain,
membrane-associated region, or one or more EGF-like
(homologous) repeats of a Serrate protein, or any combination
of the foregoing.

Antibodies to Serrate, its derivatives and analogs,
are additionally provided.

As demonstrated infra, Serrate plays a critical
role in development and other physiological processes, in
particular, as a ligand to Notch, which is involved in cell
fate (differentiation) determination. In particular, Serrate
is believed to play a major role in determining cell fates in
the central nervous system. The nucleic acid and amino acid
sequences and antibodies thereto of the invention can be used
for the detection and quantitation of Serrate mRNA and
protein of human and other species, to study expression
thereof, to produce Serrate and fragments and other
derivatives and analogs thereof, in the study and
manipulation of differentiation and other physiological
processes. The present invention also relates to therapeutic
and diagnostic methods and compositions based on Serrate
proteins and nucleic acids. The invention provides for
treatment of disorders of cell fate or differentiation by
administration of a therapeutic compound of the invention.
Such therapeutic compounds (termed herein "Therapeutics")
include: vertebrate Serrate proteins and analogs and
derivatives (including fragments) thereof; antibodies
thereto; nucleic acids encoding the vertebrate Serrate
proteins, analogs, or derivatives; and vertebrate Serrate
antisense nucleic acids. In a preferred embodiment, a
Therapeutic of the invention is administered to treat a
cancerous condition, or to prevent progression from a
pre-neoplastic or non-malignant state into a neoplastic or a
malignant state. In other specific embodiments, a
Therapeutic of the invention is administered to treat a
nervous system disorder or to promote tissue regeneration and
repair.

In one embodiment, Therapeutics which antagonize,
or inhibit, Notch and/or Serrate function (hereinafter
"Antagonist Therapeutics") are administered for therapeutic
effect. In another embodiment, Therapeutics which promote
Notch and/or Serrate function (hereinafter "Agonist
Therapeutics") are administered for therapeutic effect.

Disorders of cell fate, in particular
hyperproliferative (e.g., cancer) or hypoproliferative
disorders, involving aberrant or undesirable levels of
expression or activity or localization of Notch and/or
Serrate protein can be diagnosed by detecting such levels, as
described more fully infra.

In a preferred aspect, a Therapeutic of the
invention is a protein consisting of at least a fragment
(termed herein "adhesive fragment") of a vertebrate Serrate
which mediates binding to a Notch protein or a fragment
thereof.

The invention is illustrated by way of examples
infra which disclose, inter alia, the cloning of a mouse
Serrate homolog (Section 6), the cloning of a Xenopus (frog)
Serrate homolog (Section 7), the cloning of a chick Serrate
homolog (Section 8), and the cloning of the human Serrate
homologs Human Serrate-1 (HJ1) and Human Serrate-2 (HJ2)
(Section 9).

For clarity of disclosure, and not by way of
limitation, the detailed description of the invention is
divided into the sub-sections which follow.

5.1. ISOLATION OF THE SERRATE GENES
The invention relates to the nucleotide sequences
of vertebrate Serrate nucleic acids. In specific
embodiments, vertebrate Serrate nucleic acids comprise the
cDNA sequences shown in Figure 1 (SEQ ID NO:1), Figure 2
(SEQ ID NO:3), Figure 3 (SEQ ID NO:5) or the coding regions
thereof, or nucleic acids encoding a vertebrate Serrate
protein (e.g., having the sequence of SEQ ID NO:2, 4, or 6).
The invention provides nucleic acids consisting of
at least 8 nucleotides (i.e., a hybridizable portion) of a
vertebrate Serrate sequence; in other embodiments, the
nucleic acids consist of at least 10 (continuous)
nucleotides, 25 nucleotides, 50 nucleotides, 100 nucleotides,
150 nucleotides, or 200 nucleotides of a vertebrate Serrate
sequence, or a full-length vertebrate Serrate coding
sequence. The invention also relates to nucleic acids
hybridizable to or complementary to the foregoing sequences.
In specific aspects, nucleic acids are provided which
comprise a sequence complementary to at least 10, 25, 50,
100, or 200 nucleotides or the entire coding region of a
Serrate gene.
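
By way of illustration and not limitation, the notion of a
nucleic acid "complementary to at least 10 nucleotides" of a
sequence can be sketched computationally as follows. The
sketch assumes exact Watson-Crick complementarity, which is
stricter than hybridizability under the conditions described
herein; the function names are hypothetical and the target
shown is an arbitrary placeholder, not a Serrate sequence.

```python
# Illustrative sketch only; the target below is an arbitrary placeholder,
# not any sequence disclosed herein. Function names are hypothetical.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_complementary_portion(oligo: str, target: str, min_len: int = 10) -> bool:
    """True if `oligo` is at least `min_len` nucleotides long and is exactly
    complementary to a contiguous stretch of `target`."""
    if len(oligo) < min_len:
        return False
    return reverse_complement(oligo).upper() in target.upper()


if __name__ == "__main__":
    target = "ATGCGTACCGGTTAGCTAGGCTAACGT"    # placeholder sequence
    oligo = reverse_complement(target[5:17])  # 12-nt complementary oligo
    print(oligo, is_complementary_portion(oligo, target))
```

The minimum length is a parameter so that the same check can
be applied at each of the lengths recited above.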

In a specific embodiment, a nucleic acid which is
hybridizable to a vertebrate Serrate nucleic acid (e.g.,
having sequence SEQ ID NO:1), or to a nucleic acid encoding a
vertebrate Serrate derivative, under conditions of low
stringency is provided. By way of example and not
limitation, procedures using such conditions of low
stringency are as follows (see also Shilo and Weinberg, 1981,
Proc. Natl. Acad. Sci. USA 78:6789-6792): Filters containing
DNA are pretreated for 6 h at 40°C in a solution containing
35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA,
0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 μg/ml denatured salmon
sperm DNA. Hybridizations are carried out in the same
solution with the following modifications: 0.02% PVP, 0.02%
Ficoll, 0.2% BSA, 100 μg/ml salmon sperm DNA, 10% (wt/vol)
dextran sulfate, and 5-20 × 10⁶ cpm ³²P-labeled probe is used.
Filters are incubated in hybridization mixture for 18-20 h at
40°C, and then washed for 1.5 h at 55°C in a solution
containing 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and
0.1% SDS. The wash solution is replaced with fresh solution
and incubated an additional 1.5 h at 60°C. Filters are
blotted dry and exposed for autoradiography. If necessary,
filters are washed for a third time at 65-68°C and reexposed
to film. Other conditions of low stringency which may be
used are well known in the art (e.g., as employed for
cross-species hybridizations).
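
For planning purposes only, the timed steps of the foregoing
low stringency procedure can be written down as data and
summed. The sketch below is a hypothetical bookkeeping aid,
not part of the procedure; it covers only the steps for which
a duration is stated and omits the optional third wash at
65-68°C, for which no duration is given.

```python
# Illustrative sketch only: the low stringency steps above as structured
# data. Class and function names are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    min_hours: float
    max_hours: float


LOW_STRINGENCY = (
    Step("pretreatment", 40, 6, 6),
    Step("hybridization", 40, 18, 20),
    Step("first wash", 55, 1.5, 1.5),
    Step("second wash", 60, 1.5, 1.5),
)


def total_hours(steps):
    """Return (minimum, maximum) total incubation time in hours."""
    return (sum(s.min_hours for s in steps),
            sum(s.max_hours for s in steps))


if __name__ == "__main__":
    low, high = total_hours(LOW_STRINGENCY)
    print(f"Total timed incubation: {low}-{high} h")
```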

In another specific embodiment, a nucleic acid
which is hybridizable to a vertebrate Serrate nucleic acid
under conditions of high stringency is provided. By way of
example and not limitation, procedures using such conditions
of high stringency are as follows: Prehybridization of
filters containing DNA is carried out for 8 h to overnight at
65°C in buffer composed of 6X SSC, 50 mM Tris-HCl (pH 7.5), 1
mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 μg/ml
denatured salmon sperm DNA. Filters are hybridized for 48 h
at 65°C in prehybridization mixture containing 100 μg/ml
denatured salmon sperm DNA and 5-20 × 10⁶ cpm of ³²P-labeled
probe. Washing of filters is done at 37°C for 1 h in a
solution containing 2X SSC, 0.01% PVP, 0.01% Ficoll, and
0.01% BSA. This is followed by a wash in 0.1X SSC at 50°C
for 45 min before autoradiography. Other conditions of high
stringency which may be used are well known in the art.
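
As a worked example of preparing the high stringency
prehybridization buffer, the dilution relation C1·V1 = C2·V2
gives the volume of each stock solution required. The final
concentrations below are those recited above; the stock
concentrations (20X SSC, 1 M Tris-HCl, 0.5 M EDTA, a combined
1% PVP/Ficoll/BSA stock, and 10 mg/ml salmon sperm DNA) are
assumptions chosen for illustration and are not stated in
this description.

```python
# Illustrative sketch only. Stock concentrations below are assumptions
# (typical laboratory stocks), not values stated in this description.

# (component, final concentration, assumed stock concentration), same units
PREHYB_COMPONENTS = [
    ("SSC (X)", 6, 20),                        # assumed 20X SSC stock
    ("Tris-HCl pH 7.5 (mM)", 50, 1000),        # assumed 1 M stock
    ("EDTA (mM)", 1, 500),                     # assumed 0.5 M stock
    ("PVP/Ficoll/BSA (% each)", 0.02, 1),      # assumed combined 1% stock
    ("salmon sperm DNA (ug/ml)", 500, 10000),  # assumed 10 mg/ml stock
]


def stock_volume_ml(final_conc, stock_conc, total_ml):
    """Volume of stock needed, from C1*V1 = C2*V2."""
    return final_conc * total_ml / stock_conc


def recipe(total_ml):
    """Return {component: ml of stock}, plus water to reach total_ml."""
    volumes = {name: stock_volume_ml(final, stock, total_ml)
               for name, final, stock in PREHYB_COMPONENTS}
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, ml in recipe(100).items():
        print(f"{name:28s} {ml:6.2f} ml")
```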

Nucleic acids encoding fragments and derivatives of
vertebrate Serrate proteins (see Section 5.6), and vertebrate
Serrate antisense nucleic acids (see Section 5.11) are
additionally provided. As is readily apparent, as used
herein, a "nucleic acid encoding a fragment or portion of a
Serrate protein" shall be construed as referring to a nucleic
acid encoding only the recited fragment or portion of the
Serrate protein and not the other contiguous portions of the
Serrate protein as a continuous sequence.

Fragments of vertebrate Serrate nucleic acids
comprising regions of homology to other toporythmic proteins
are also provided. The DSL regions (regions of homology with
Drosophila Delta and Serrate) of Serrate proteins of other
species are also provided. Nucleic acids encoding conserved
regions between Delta and Serrate, such as those represented
by Serrate amino acids 63-73, 124-134, 149-158, 195-206,
214-219, and 250-259 of SEQ ID NO:8, or by the DSL domains are
also provided.
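
The residue ranges recited above are 1-based and inclusive.
By way of illustration only, the following sketch shows how
such ranges map onto a protein sequence held as a string; the
helper names are hypothetical, and the sequence used is an
arbitrary placeholder rather than SEQ ID NO:8.

```python
# Illustrative sketch only; the placeholder below is NOT SEQ ID NO:8.
# Helper names are hypothetical.

# 1-based, inclusive residue ranges as recited in the text above.
CONSERVED_RANGES = [(63, 73), (124, 134), (149, 158),
                    (195, 206), (214, 219), (250, 259)]


def extract_region(protein, start, end):
    """Return residues start..end (1-based, inclusive) of `protein`."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError(
            f"range {start}-{end} outside sequence of length {len(protein)}")
    return protein[start - 1:end]


def conserved_segments(protein, ranges=CONSERVED_RANGES):
    """Map each 'start-end' label to the corresponding residues."""
    return {f"{s}-{e}": extract_region(protein, s, e) for s, e in ranges}


if __name__ == "__main__":
    placeholder = "ACDEFGHIKLMNPQRSTVWY" * 15  # 300 residues, hypothetical
    for label, segment in conserved_segments(placeholder).items():
        print(label, segment, len(segment))
```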

Specific embodiments for the cloning of a
vertebrate Serrate gene, presented as a particular example
but not by way of limitation, follow:

For expression cloning (a technique commonly known
in the art), an expression library is constructed by methods
known in the art. For example, mRNA (e.g., human) is
isolated, cDNA is made and ligated into an expression vector
(e.g., a bacteriophage derivative) such that it is capable of
being expressed by the host cell into which it is then
introduced. Various screening assays can then be used to
select for the expressed Serrate product. In one embodiment,
anti-Serrate antibodies can be used for selection.

In another preferred aspect, PCR is used to amplify
the desired sequence in a genomic or cDNA library, prior to
selection. Oligonucleotide primers representing known
Serrate sequences can be used as primers in PCR. In a
preferred aspect, the oligonucleotide primers encode at least
part of the Serrate conserved segments of strong homology
between Serrate and Delta. The synthetic oligonucleotides
may be utilized as primers to amplify by PCR sequences from a
source (RNA or DNA), preferably a cDNA library, of potential
interest. PCR can be carried out, e.g., by use of a
Perkin-Elmer Cetus thermal cycler and Taq polymerase (Gene
Amp™). The DNA being amplified can include mRNA or cDNA or
genomic DNA from any eukaryotic species. One can choose to
synthesize several different degenerate primers, for use in
the PCR reactions. It is also possible to vary the
stringency of hybridization conditions used in priming the
PCR reactions, to allow for greater or lesser degrees of
nucleotide sequence similarity between the known Serrate
nucleotide sequence and the nucleic acid homolog being
isolated. For cross species hybridization, low stringency
conditions are preferred. For same species hybridization,
moderately stringent conditions are preferred. After
successful amplification of a segment of a Serrate homolog,
that segment may be cloned and sequenced, and utilized as a
probe to isolate a complete cDNA or genomic clone. This, in
turn, will permit the determination of the gene's complete
nucleotide sequence, the analysis of its expression, and the
production of its protein product for functional analysis, as
described infra. In this fashion, additional genes encoding
Serrate proteins may be identified. Such a procedure is
presented by way of example in various examples sections
infra.
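
By way of illustration of degenerate primer design, a short
conserved peptide can be back-translated into every DNA
sequence capable of encoding it under the standard genetic
code; the number of such sequences is the primer degeneracy.
The following is a minimal sketch under that assumption. The
function names and the example peptide are hypothetical, and
the peptide is not taken from any Serrate or Delta sequence.

```python
# Illustrative sketch only: back-translation of a peptide under the
# standard genetic code. The example peptide is hypothetical.
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

# Map each amino acid (one-letter code) to the codons that encode it.
CODONS = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append(b1 + b2 + b3)


def degeneracy(peptide):
    """Number of distinct DNA sequences that encode `peptide`."""
    return prod(len(CODONS[aa]) for aa in peptide)


def all_primers(peptide):
    """Every sense-strand DNA sequence that encodes `peptide`."""
    return ["".join(codons)
            for codons in product(*(CODONS[aa] for aa in peptide))]


if __name__ == "__main__":
    peptide = "CWMD"  # hypothetical low-degeneracy peptide
    print(degeneracy(peptide), all_primers(peptide))
```

Peptides rich in Met and Trp (one codon each) give the lowest
degeneracy, whereas Leu, Ser, and Arg (six codons each) raise
it sharply, which is one practical consideration in choosing
which conserved segment to prime from.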

The above methods are not meant to limit the
following general description of methods by which clones of
vertebrate Serrate may be obtained.

Any vertebrate cell potentially can serve as the
nucleic acid source for the molecular cloning of the Serrate
gene. The nucleic acid sequences encoding Serrate can be
isolated from human, porcine, bovine, feline, avian, equine,
canine, as well as additional primate sources, etc. For
example, we have amplified fragments of the appropriate size
in mouse, Xenopus, and human, by PCR using cDNA libraries
with Drosophila Serrate primers. The DNA may be obtained by
standard procedures known in the art from cloned DNA (e.g., a






WO 96/27610 



PCT/US96/03172 



DNA "library"), by chemical synthesis, by cDNA cloning, or by
the cloning of genomic DNA, or fragments thereof, purified
from the desired cell. (See, for example, Sambrook et al.,
1989, Molecular Cloning, A Laboratory Manual, 2d Ed., Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York;
Glover, D. M. (ed.), 1985, DNA Cloning: A Practical Approach,
MRL Press, Ltd., Oxford, U.K. Vol. I, II.) Clones derived
from genomic DNA may contain regulatory and intron DNA
regions in addition to coding regions; clones derived from
cDNA will contain only exon sequences. Whatever the source,
the gene should be molecularly cloned into a suitable vector
for propagation of the gene.

In the molecular cloning of the gene from genomic
DNA, DNA fragments are generated, some of which will encode
the desired gene. The DNA may be cleaved at specific sites
using various restriction enzymes. Alternatively, one may
use DNAse in the presence of manganese to fragment the DNA,
or the DNA can be physically sheared, as for example, by
sonication. The linear DNA fragments can then be separated
according to size by standard techniques, including but not
limited to, agarose and polyacrylamide gel electrophoresis
and column chromatography.
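
By way of illustration only, cleavage of linear DNA at a
specific recognition site followed by ordering of the
fragments by size can be sketched as follows. EcoRI
(recognition site GAATTC, cut after the first G) is used
purely as a familiar example enzyme and is not named in this
description; the input sequence is an arbitrary placeholder
and the function names are hypothetical.

```python
# Illustrative sketch only: in silico digestion of a linear DNA molecule
# on one strand. EcoRI is an example enzyme chosen for illustration.

def digest(seq, site="GAATTC", cut_offset=1):
    """Cut linear `seq` at every occurrence of `site`, `cut_offset`
    bases into the site, and return the fragments in 5'->3' order."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def by_size(fragments):
    """Order fragments largest first, as they would resolve on a gel."""
    return sorted(fragments, key=len, reverse=True)


if __name__ == "__main__":
    dna = "AAAGAATTCCCCCGAATTCTT"  # placeholder sequence
    for fragment in by_size(digest(dna)):
        print(len(fragment), fragment)
```

The fragment lengths obtained in this way are the quantities
that would be compared against a known restriction map, where
one is available.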

Once the DNA fragments are generated,
identification of the specific DNA fragment containing the
desired gene may be accomplished in a number of ways. For
example, if a Serrate (of any species) gene or its specific
RNA, or a fragment thereof, e.g., an extracellular domain
(see Section 5.6), is available and can be purified and
labeled, the generated DNA fragments may be screened by
nucleic acid hybridization to the labeled probe (Benton, W.
and Davis, R., 1977, Science 196:180; Grunstein, M. and
Hogness, D., 1975, Proc. Natl. Acad. Sci. U.S.A. 72:3961).
Those DNA fragments with substantial homology to the probe
will hybridize. It is also possible to identify the
appropriate fragment by restriction enzyme digestion(s) and
comparison of fragment sizes with those expected according to
a known restriction map if such is available. Further [...]
for cDNA cloning of the Serrate gene can be isolated from
cells which express Serrate. Other methods are possible and
within the scope of the invention.

The identified and isolated gene can then be
inserted into an appropriate cloning vector. A large number
of vector-host systems known in the art may be used.
Possible vectors include, but are not limited to, plasmids or
modified viruses, but the vector system must be compatible
with the host cell used. Such vectors include, but are not
limited to, bacteriophages such as lambda derivatives, or
plasmids such as pBR322 or pUC plasmid derivatives. The
insertion into a cloning vector can, for example, be
accomplished by ligating the DNA fragment into a cloning
vector which has complementary cohesive termini. However, if
the complementary restriction sites used to fragment the DNA
are not present in the cloning vector, the ends of the DNA
molecules may be enzymatically modified. Alternatively, any
site desired may be produced by ligating nucleotide sequences
(linkers) onto the DNA termini; these ligated linkers may
comprise specific chemically synthesized oligonucleotides
encoding restriction endonuclease recognition sequences. In
an alternative method, the cleaved vector and Serrate gene
may be modified by homopolymeric tailing. Recombinant
molecules can be introduced into host cells via
transformation, transfection, infection, electroporation,
etc., so that many copies of the gene sequence are generated.

In an alternative method, the desired gene may be
identified and isolated after insertion into a suitable
cloning vector in a "shot gun" approach. Enrichment for the
desired gene, for example, by size fractionation, can be
done before insertion into the cloning vector.

In specific embodiments, transformation of host
cells with recombinant DNA molecules that incorporate the
isolated Serrate gene, cDNA, or synthesized DNA sequence
enables generation of multiple copies of the gene. Thus, the
gene may be obtained in large quantities by growing
transformants, isolating the recombinant DNA molecules from
the transformants and, when necessary, retrieving the
inserted gene from the isolated recombinant DNA.

The Serrate sequences provided by the instant
invention include those nucleotide sequences encoding
substantially the same amino acid sequences as found in
native Serrate proteins, and those encoded amino acid
sequences with functionally equivalent amino acids, all as
described in Section 5.6 infra for Serrate derivatives.

5.2. EXPRESSION OF THE SERRATE GENES

The nucleotide sequence coding for a vertebrate
Serrate protein or a functionally active fragment or other
derivative thereof (see Section 5.6), can be inserted into an
appropriate expression vector, i.e., a vector which contains
the necessary elements for the transcription and translation
of the inserted protein-coding sequence. The necessary
transcriptional and translational signals can also be
supplied by the native vertebrate Serrate gene and/or its
flanking regions. A variety of host-vector systems may be
utilized to express the protein-coding sequence. These
include but are not limited to mammalian cell systems
infected with virus (e.g., vaccinia virus, adenovirus, etc.);
insect cell systems infected with virus (e.g., baculovirus);
microorganisms such as yeast containing yeast vectors, or
bacteria transformed with bacteriophage DNA, plasmid DNA, or
cosmid DNA. The expression elements of vectors vary in
their strengths and specificities. Depending on the host-
vector system utilized, any one of a number of suitable
transcription and translation elements may be used. In a
specific embodiment, the adhesive portion of the Serrate gene
is expressed. In other specific embodiments, a human Serrate
gene or a sequence encoding a functionally active portion of
a human Serrate gene, such as Human Serrate-1 (HJ1) or Human
Serrate-2 (HJ2), is expressed. In yet another embodiment, a
fragment of Serrate comprising the extracellular domain, or
other derivative, or analog of Serrate is expressed.

Any of the methods previously described for the
insertion of DNA fragments into a vector may be used to
construct expression vectors containing a chimeric gene
consisting of appropriate transcriptional/translational
control signals and the protein coding sequences. These
methods may include in vitro recombinant DNA and synthetic
techniques and in vivo recombinants (genetic recombination).
Expression of a nucleic acid sequence encoding a Serrate
protein or peptide fragment may be regulated by a second
nucleic acid sequence so that the Serrate protein or peptide
is expressed in a host transformed with the recombinant DNA
molecule. For example, expression of a Serrate protein may
be controlled by any promoter/enhancer element known in the
art. Promoters which may be used to control toporythmic gene
expression include, but are not limited to, the SV40 early
promoter region (Bernoist and Chambon, 1981, Nature 290:304-
310), the promoter contained in the 3' long terminal repeat
of Rous sarcoma virus (Yamamoto, et al., 1980, Cell 22:787-
797), the herpes thymidine kinase promoter (Wagner et al.,
1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the
regulatory sequences of the metallothionein gene (Brinster et
al., 1982, Nature 296:39-42); prokaryotic expression vectors
such as the β-lactamase promoter (Villa-Kamaroff, et al.,
1978, Proc. Natl. Acad. Sci. U.S.A. 75:3727-3731), or the tac
promoter (DeBoer, et al., 1983, Proc. Natl. Acad. Sci. U.S.A.
80:21-25); see also "Useful proteins from recombinant
bacteria" in Scientific American, 1980, 242:74-94; plant
expression vectors comprising the nopaline synthetase
promoter region (Herrera-Estrella et al., Nature 303:209-213)
or the cauliflower mosaic virus 35S RNA promoter (Gardner, et
al., 1981, Nucl. Acids Res. 9:2871), and the promoter of the
photosynthetic enzyme ribulose biphosphate carboxylase
(Herrera-Estrella et al., 1984, Nature 310:115-120); promoter
elements from yeast or other fungi such as the Gal 4
promoter, the ADC (alcohol dehydrogenase) promoter, PGK
(phosphoglycerol kinase) promoter, alkaline phosphatase
promoter, and the following animal transcriptional control
regions, which exhibit tissue specificity and have been
utilized in transgenic animals: elastase I gene control
region which is active in pancreatic acinar cells (Swift et
al., 1984, Cell 38:639-646; Ornitz et al., 1986, Cold Spring
Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987,
Hepatology 7:425-515); insulin gene control region which is
active in pancreatic beta cells (Hanahan, 1985, Nature
315:115-122), immunoglobulin gene control region which is
active in lymphoid cells (Grosschedl et al., 1984, Cell
38:647-658; Adames et al., 1985, Nature 318:533-538;
Alexander et al., 1987, Mol. Cell. Biol. 7:1436-1444), mouse
mammary tumor virus control region which is active in
testicular, breast, lymphoid and mast cells (Leder et al.,
1986, Cell 45:485-495), albumin gene control region which is
active in liver (Pinkert et al., 1987, Genes and Devel.
1:268-276), alpha-fetoprotein gene control region which is
active in liver (Krumlauf et al., 1985, Mol. Cell. Biol.
5:1639-1648; Hammer et al., 1987, Science 235:53-58); alpha 1-
antitrypsin gene control region which is active in the liver
(Kelsey et al., 1987, Genes and Devel. 1:161-171), beta-
globin gene control region which is active in myeloid cells
(Mogram et al., 1985, Nature 315:338-340; Kollias et al.,
1986, Cell 46:89-94); myelin basic protein gene control region
which is active in oligodendrocyte cells in the brain
(Readhead et al., 1987, Cell 48:703-712); myosin light chain-
2 gene control region which is active in skeletal muscle
(Sani, 1985, Nature 314:283-286), and gonadotropic releasing
hormone gene control region which is active in the
hypothalamus (Mason et al., 1986, Science 234:1372-1378).

Expression vectors containing Serrate gene inserts
can be identified by three general approaches: (a) nucleic
acid hybridization, (b) presence or absence of "marker" gene
functions, and (c) expression of inserted sequences. In the
first approach, the presence of a foreign gene inserted in an
expression vector can be detected by nucleic acid
hybridization using probes comprising sequences that are
homologous to an inserted toporythmic gene. In the second
approach, the recombinant vector/host system can be
identified and selected based upon the presence or absence of
certain "marker" gene functions (e.g., thymidine kinase
activity, resistance to antibiotics, transformation
phenotype, occlusion body formation in baculovirus, etc.)
caused by the insertion of foreign genes in the vector. For
example, if the Serrate gene is inserted within the marker
gene sequence of the vector, recombinants containing the
Serrate insert can be identified by the absence of the marker
gene function. In the third approach, recombinant expression
vectors can be identified by assaying the foreign gene
product expressed by the recombinant. Such assays can be
based, for example, on the physical or functional properties
of the Serrate gene product in in vitro assay systems, e.g.,
aggregation (binding) with Notch, binding to a receptor,
binding with antibody.

Once a particular recombinant DNA molecule is
identified and isolated, several methods known in the art may
be used to propagate it. Once a suitable host system and
growth conditions are established, recombinant expression
vectors can be propagated and prepared in quantity. As
previously explained, the expression vectors which can be
used include, but are not limited to, the following vectors
or their derivatives: human or animal viruses such as
vaccinia virus or adenovirus; insect viruses such as
baculovirus; yeast vectors; bacteriophage vectors (e.g.,
lambda), and plasmid and cosmid DNA vectors, to name but a
few.

In addition, a host cell strain may be chosen which
modulates the expression of the inserted sequences, or
modifies and processes the gene product in the specific
fashion desired. Expression from certain promoters can be
elevated in the presence of certain inducers; thus,
expression of the genetically engineered Serrate protein may
be controlled. Furthermore, different host cells have
characteristic and specific mechanisms for the translational
and post-translational processing and modification (e.g.,
glycosylation, cleavage [e.g., of signal sequence]) of
proteins. Appropriate cell lines or host systems can be
chosen to ensure the desired modification and processing of
the foreign protein expressed. For example, expression in a
bacterial system can be used to produce an unglycosylated
core protein product. Expression in yeast will produce a
glycosylated product. Expression in mammalian cells can be
used to ensure "native" glycosylation of a heterologous
mammalian toporythmic protein. Furthermore, different
vector/host expression systems may effect processing
reactions such as proteolytic cleavages to different extents.

In other specific embodiments, the Serrate protein,
fragment, analog, or derivative may be expressed as a fusion,
or chimeric protein product (comprising the protein,
fragment, analog, or derivative joined via a peptide bond to
a heterologous protein sequence (of a different protein)).
Such a chimeric product can be made by ligating the
appropriate nucleic acid sequences encoding the desired amino
acid sequences to each other by methods known in the art, in
the proper coding frame, and expressing the chimeric product
by methods commonly known in the art. Alternatively, such a
chimeric product may be made by protein synthetic techniques,
e.g., by use of a peptide synthesizer.

Both cDNA and genomic sequences can be cloned and
expressed.



5.3. IDENTIFICATION AND PURIFICATION
OF THE SERRATE GENE PRODUCTS

In particular aspects, the invention provides amino
acid sequences of a vertebrate Serrate, preferably a human
Serrate homolog, and fragments and derivatives thereof which
comprise an antigenic determinant (i.e., can be recognized by
an antibody) or which are otherwise functionally active, as
well as nucleic acid sequences encoding the foregoing.
"Functionally active" material as used herein refers to that
material displaying one or more known functional activities
associated with a full-length (wild-type) Serrate protein,
e.g., binding to Notch or a portion thereof, binding to any
other Serrate ligand, antigenicity (binding to an anti-
Serrate antibody), etc.

In specific embodiments, the invention provides
fragments of a vertebrate Serrate protein consisting of at
least 6 amino acids, 10 amino acids, 25 amino acids, 50 amino
acids, or of at least 75 amino acids. In other embodiments,
the proteins comprise or consist essentially of an
extracellular domain, DSL domain, epidermal growth factor-
like repeat (ELR) domain, one or any combination of ELRs,
cysteine-rich region, transmembrane domain, or intracellular
(cytoplasmic) domain, or a portion which binds to Notch, or
any combination of the foregoing, of a Serrate protein.
Fragments, or proteins comprising fragments, lacking some or
all of the foregoing regions of a vertebrate Serrate protein
are also provided. Nucleic acids encoding the foregoing are
provided.

Once a recombinant which expresses the vertebrate
Serrate gene sequence is identified, the gene product can be
analyzed. This is achieved by assays based on the physical
or functional properties of the product, including
radioactive labelling of the product followed by analysis by
gel electrophoresis, immunoassay, etc.

Once the Serrate protein is identified, it may be
isolated and purified by standard methods including
chromatography (e.g., ion exchange, affinity, and sizing
column chromatography), centrifugation, differential
solubility, or by any other standard technique for the
purification of proteins. The functional properties may be
evaluated using any suitable assay (see Section 5.7).

Alternatively, once a Serrate protein produced by a
recombinant is identified, the amino acid sequence of the
protein can be deduced from the nucleotide sequence of the
chimeric gene contained in the recombinant. As a result, the
protein can be synthesized by standard chemical methods known
in the art (e.g., see Hunkapiller, M., et al., 1984, Nature
310:105-111).

In a specific embodiment of the present invention,
such Serrate proteins, whether produced by recombinant DNA
techniques or by chemical synthetic methods, include but are
not limited to those containing, as a primary amino acid
sequence, all or part of the amino acid sequence
substantially as depicted in Figures 1, 2, or 3 (SEQ ID NO:2,
4, or 6, respectively), as well as fragments and other
derivatives, and analogs thereof.

5.4. STRUCTURE OF THE SERRATE GENES AND PROTEINS

The structure of the Serrate genes and proteins can 
be analyzed by various methods known in the art. 

5.4.1. GENETIC ANALYSIS 
The cloned DNA or cDNA corresponding to the
vertebrate Serrate gene can be analyzed by methods including
but not limited to Southern hybridization (Southern, E.M.,
1975, J. Mol. Biol. 98:503-517), Northern hybridization (see
e.g., Freeman et al., 1983, Proc. Natl. Acad. Sci. U.S.A.
80:4094-4098), restriction endonuclease mapping (Maniatis,
T., 1982, Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor, New York), and DNA sequence analysis. Polymerase
chain reaction (PCR; U.S. Patent Nos. 4,683,202, 4,683,195
and 4,889,818; Gyllenstein et al., 1988, Proc. Natl. Acad.
Sci. U.S.A. 85:7652-7656; Ochman et al., 1988, Genetics
120:621-623; Loh et al., 1989, Science 243:217-220) followed
by Southern hybridization with a Serrate-specific probe can
allow the detection of the Serrate gene in DNA from various
cell types. Methods of amplification other than PCR are
commonly known and can also be employed. In one embodiment,
Southern hybridization can be used to determine the genetic
linkage of Serrate. Northern hybridization analysis can be
used to determine the expression of the Serrate gene.
Various cell types, at various states of development or
activity, can be tested for Serrate expression. Examples of
such techniques and their results are described in Section 6,
infra. The stringency of the hybridization conditions for
both Southern and Northern hybridization can be manipulated
to ensure detection of nucleic acids with the desired degree
of relatedness to the specific Serrate probe used.

Restriction endonuclease mapping can be used to
roughly determine the genetic structure of the Serrate gene.
In a particular embodiment, cleavage with restriction enzymes
can be used to derive the restriction map shown in Figure 2,
infra. Restriction maps derived by restriction endonuclease
cleavage can be confirmed by DNA sequence analysis.
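
By way of illustration only, and not by way of limitation, the
confirmation of a restriction map against a determined DNA
sequence can be sketched in Python as follows. The sketch
locates recognition sites in a linear sequence and computes the
expected fragment lengths. The three enzymes listed, the helper
names, and the example sequence are assumptions chosen for
illustration and are not taken from the Serrate sequences.

```python
# Minimal sketch: predict restriction fragments of a linear DNA sequence.
# Each entry maps an enzyme to (recognition site, cut offset within the site
# on the top strand). The example sequence below is made up.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def cut_positions(seq, enzyme):
    """Return 0-based top-strand cut positions for one enzyme."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, enzymes):
    """Return fragment lengths, 5' to 3', after digestion with all enzymes."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


example = "ATGCGAATTCTTAGGATCCAAGAATTCGG"
print(fragment_lengths(example, ["EcoRI"]))
print(fragment_lengths(example, ["EcoRI", "BamHI"]))
```

Fragment lengths predicted in this way can be compared with the
sizes observed by gel electrophoresis; a mismatch suggests an
error in either the map or the sequence.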

DNA sequence analysis can be performed by any
techniques known in the art, including but not limited to the
method of Maxam and Gilbert (1980, Meth. Enzymol. 65:499-
560), the Sanger dideoxy method (Sanger, F., et al., 1977,
Proc. Natl. Acad. Sci. U.S.A. 74:5463), the use of T7 DNA
polymerase (Tabor and Richardson, U.S. Patent No. 4,795,699),
or use of an automated DNA sequenator (e.g., Applied
Biosystems, Foster City, CA). The cDNA sequence of a
representative Serrate gene comprises the sequence
substantially as depicted in Figures 1 and 2, and is
described in Section 9, infra.

5.4.2. PROTEIN ANALYSIS 
The amino acid sequence of the Serrate proteins can
be derived by deduction from the DNA sequence, or
alternatively, by direct sequencing of the protein, e.g.,
with an automated amino acid sequencer. The amino acid
sequence of a representative Serrate protein comprises the
sequence substantially as depicted in Figure 1, and detailed
in Section 9, infra, with the representative mature protein
being that shown by amino acid numbers 30-1219.

The Serrate protein sequence can be further
characterized by a hydrophilicity analysis (Hopp, T. and
Woods, K., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:3824). A
hydrophilicity profile can be used to identify the
hydrophobic and hydrophilic regions of the Serrate protein
and the corresponding regions of the gene sequence which
encode such regions.
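
By way of illustration only, a hydrophilicity profile of the
kind referred to above can be sketched in Python as a sliding
window average over the protein sequence. The per-residue
values below are those commonly cited for the Hopp and Woods
scale and should be checked against the reference before use;
the six-residue window, the function names, and the test
peptides are assumptions made for this sketch.

```python
# Minimal sketch of a Hopp-Woods style hydrophilicity profile.
# Values are the commonly cited Hopp and Woods (1981) hydrophilicity values;
# verify against the original reference before relying on them.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(protein, window=6):
    """Average hydrophilicity of every window of `window` residues."""
    values = [HOPP_WOODS[aa] for aa in protein.upper()]
    if len(values) < window:
        return []
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def most_hydrophilic_window(protein, window=6):
    """Return (start index, mean value) of the most hydrophilic window.

    Raises ValueError if the protein is shorter than the window.
    """
    profile = hydrophilicity_profile(protein, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best, profile[best]


print(most_hydrophilic_window("LLLLLLKKKKKK"))
```

Windows with the highest average values are candidate surface
exposed, and therefore potentially antigenic, regions; windows
with strongly negative values are candidate hydrophobic regions
such as a signal sequence or transmembrane domain.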

Secondary structural analysis (Chou, P. and
Fasman, G., 1974, Biochemistry 13:222) can also be done, to
identify regions of Serrate that assume specific secondary
structures.

Manipulation, translation, and secondary structure
prediction, as well as open reading frame prediction and
plotting, can also be accomplished using computer software
programs available in the art.
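
By way of illustration only, translation and open reading
frame prediction of the kind performed by such software can be
sketched in Python as follows. The sketch uses the standard
genetic code, scans only the three forward frames, and assumes
a sequence containing only A, C, G, and T; the function names
and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 0; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - 2, 3))


def longest_orf(dna):
    """Return (start, end, protein) of the longest ATG...stop ORF.

    Only the three forward reading frames are scanned. `end` is the index
    just past the stop codon; the stop is not included in `protein`.
    """
    dna = dna.upper()
    best = (0, 0, "")
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and CODON_TABLE[codon] == "*":
                if i + 3 - start > best[1] - best[0]:
                    best = (start, i + 3, translate(dna[start:i]))
                start = None
    return best


print(longest_orf("CCATGGCTAAATGACC"))
```

A fuller analysis would also scan the reverse complement and
handle ambiguous bases; both are omitted here for brevity.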

Other methods of structural analysis can also be
employed. These include but are not limited to X-ray
crystallography (Engstom, A., 1974, Biochem. Exp. Biol. 11:7-
13) and computer modeling (Fletterick, R. and Zoller, M.
(eds.), 1986, Computer Graphics and Molecular Modeling, in
Current Communications in Molecular Biology, Cold Spring
Harbor Laboratory, Cold Spring Harbor, New York).

5.5. GENERATION OF ANTIBODIES TO SERRATE 
PROTEINS AND DERIVATIVES THEREOF 

According to the invention, a vertebrate Serrate
protein, its fragments or other derivatives, or analogs
thereof, may be used as an immunogen to generate antibodies
which recognize such an immunogen. Such antibodies include
but are not limited to polyclonal, monoclonal, chimeric,
single chain, Fab fragments, and an Fab expression library.
In a specific embodiment, antibodies to human Serrate are
produced. In another embodiment, antibodies to the
extracellular domain of Serrate are produced. In another
embodiment, antibodies to the intracellular domain of Serrate
are produced.

Various procedures known in the art may be used for
the production of polyclonal antibodies to a Serrate protein
or derivative or analog. In a particular embodiment, rabbit
polyclonal antibodies to an epitope of the Serrate protein
encoded by a sequence depicted in Figure 1, or a subsequence
thereof, can be obtained. For the production of antibody,
various host animals can be immunized by injection with the
native Serrate protein, or a synthetic version, or derivative
(e.g., fragment) thereof, including but not limited to
rabbits, mice, rats, etc. Various adjuvants may be used to
increase the immunological response, depending on the host
species, and including but not limited to Freund's (complete
and incomplete), mineral gels such as aluminum hydroxide,
surface active substances such as lysolecithin, pluronic
polyols, polyanions, peptides, oil emulsions, keyhole limpet
hemocyanins, dinitrophenol, and potentially useful human
adjuvants such as BCG (bacille Calmette-Guerin) and
Corynebacterium parvum.

For preparation of monoclonal antibodies directed
toward a vertebrate Serrate protein sequence or analog
thereof, any technique which provides for the production of
antibody molecules by continuous cell lines in culture may be
used. Examples include the hybridoma technique originally
developed by Kohler and Milstein (1975, Nature 256:495-497),
as well as the trioma technique, the human B-cell hybridoma
technique (Kozbor et al., 1983, Immunology Today 4:72), and
the EBV-hybridoma technique to produce human monoclonal
antibodies (Cole et al., 1985, in Monoclonal Antibodies and
Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). In an
additional embodiment of the invention, monoclonal antibodies
can be produced in germ-free animals utilizing recent
technology (PCT/US90/02545). According to the invention,
human antibodies may be used and can be obtained by using
human hybridomas (Cote et al., 1983, Proc. Natl. Acad. Sci.
U.S.A. 80:2026-2030) or by transforming human B cells with
EBV virus in vitro (Cole et al., 1985, in Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, pp. 77-96). In
fact, according to the invention, techniques developed for
the production of "chimeric antibodies" (Morrison et al.,
1984, Proc. Natl. Acad. Sci. U.S.A. 81:6851-6855; Neuberger
et al., 1984, Nature 312:604-608; Takeda et al., 1985, Nature
314:452-454) by splicing the genes from a mouse antibody
molecule specific for Serrate together with genes from a
human antibody molecule of appropriate biological activity
can be used; such antibodies are within the scope of this
invention.

According to the invention, techniques described
for the production of single chain antibodies (U.S. Patent
4,946,778) can be adapted to produce Serrate-specific single
chain antibodies. An additional embodiment of the invention
utilizes the techniques described for the construction of Fab
expression libraries (Huse et al., 1989, Science 246:1275-
1281) to allow rapid and easy identification of monoclonal
Fab fragments with the desired specificity for Serrate
proteins, derivatives, or analogs.

Antibody fragments which contain the idiotype of
the molecule can be generated by known techniques. For
example, such fragments include but are not limited to: the
F(ab')2 fragment which can be produced by pepsin digestion of
the antibody molecule; the Fab' fragments which can be
generated by reducing the disulfide bridges of the F(ab')2
fragment, and the Fab fragments which can be generated by
treating the antibody molecule with papain and a reducing
agent.

In the production of antibodies, screening for the
desired antibody can be accomplished by techniques known in
the art, e.g., ELISA (enzyme-linked immunosorbent assay). For
example, to select antibodies which recognize a specific
domain of a Serrate protein, one may assay generated
hybridomas for a product which binds to a Serrate fragment
containing such domain. For selection of an antibody
specific to vertebrate (e.g., human) Serrate, one can select
on the basis of positive binding to vertebrate Serrate and a
lack of binding to Drosophila Serrate. In another
embodiment, one can select for binding to human Serrate and
not to Serrate of other species.

The foregoing antibodies can be used in methods
known in the art relating to the localization and activity of
the protein sequences of the invention (e.g., see Section
5.7, infra), e.g., for imaging these proteins, measuring
levels thereof in appropriate physiological samples, in
diagnostic methods, etc.

Antibodies specific to a domain of a Serrate
protein are also provided. In a specific embodiment,
antibodies which bind to a Notch-binding fragment of Serrate
are provided.

In another embodiment of the invention (see infra),
anti-Serrate antibodies and fragments thereof containing the
binding domain are Therapeutics.

5.6. SERRATE PROTEINS, DERIVATIVES AND ANALOGS

The invention further relates to vertebrate Serrate
proteins, and derivatives (including but not limited to
fragments) and analogs of Serrate proteins. Nucleic acids
encoding vertebrate Serrate protein derivatives and protein
analogs are also provided. In one embodiment, the Serrate
proteins are encoded by the vertebrate Serrate nucleic acids
described in Section 5.1 supra. In particular aspects, the
proteins, derivatives, or analogs are of frog, mouse, rat,
pig, cow, dog, monkey, or human Serrate proteins.

The production and use of derivatives and analogs
related to vertebrate Serrate are within the scope of the
present invention. In a specific embodiment, the derivative
or analog is functionally active, i.e., capable of exhibiting
one or more functional activities associated with a full-
length, wild-type Serrate protein. As one example, such
derivatives or analogs which have the desired immunogenicity
or antigenicity can be used, for example, in immunoassays,
for immunization, for inhibition of Serrate activity, etc.
Such molecules which retain, or alternatively inhibit, a
desired Serrate property, e.g., binding to Notch or other
toporythmic proteins, binding to a cell-surface receptor, can
be used as inducers, or inhibitors, respectively, of such
property and its physiological correlates. A specific
embodiment relates to a Serrate fragment that can be bound by
an anti-Serrate antibody but cannot bind to a Notch protein
or other toporythmic protein. Derivatives or analogs of
Serrate can be tested for the desired activity by procedures
known in the art, including but not limited to the assays
described in Section 5.7.

In particular, Serrate derivatives can be made by
altering Serrate sequences by substitutions, additions or
deletions that provide for functionally equivalent molecules.
Due to the degeneracy of nucleotide coding sequences, other
DNA sequences which encode substantially the same amino acid
sequence as a Serrate gene may be used in the practice of the
present invention. These include but are not limited to
nucleotide sequences comprising all or portions of Serrate
genes which are altered by the substitution of different
codons that encode a functionally equivalent amino acid
residue within the sequence, thus producing a silent change.
Likewise, the Serrate derivatives of the invention include,
but are not limited to, those containing, as a primary amino
acid sequence, all or part of the amino acid sequence of a
Serrate protein including altered sequences in which
functionally equivalent amino acid residues are substituted
for residues within the sequence resulting in a silent
change. For example, one or more amino acid residues within
the sequence can be substituted by another amino acid of a
similar polarity which acts as a functional equivalent,
resulting in a silent alteration. Substitutes for an amino
acid within the sequence may be selected from other members
of the class to which the amino acid belongs. For example,
the nonpolar (hydrophobic) amino acids include alanine,
leucine, isoleucine, valine, proline, phenylalanine,
tryptophan and methionine. The polar neutral amino acids
include glycine, serine, threonine, cysteine, tyrosine,
asparagine, and glutamine. The positively charged (basic)
amino acids include arginine, lysine and histidine. The
negatively charged (acidic) amino acids include aspartic acid
and glutamic acid.
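
By way of illustration only, the four amino acid classes
recited above can be encoded in Python and used to test
whether a substitution stays within a class, or to enumerate
the same-class substitutes at a given position. The one-letter
codes follow standard usage; the function names and the example
peptide are hypothetical.

```python
# The four classes recited in the text, in one-letter amino acid codes.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "polar_neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def residue_class(aa):
    """Return the class name of a one-letter amino acid code."""
    for name, members in CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {aa!r}")


def is_conservative(a, b):
    """True when both residues belong to the same class."""
    return residue_class(a) == residue_class(b)


def conservative_variants(peptide, position):
    """Peptides in which the residue at `position` (0-based) is replaced
    by each other member of its class, in alphabetical order."""
    original = peptide[position].upper()
    members = CLASSES[residue_class(original)]
    return [peptide[:position] + aa + peptide[position + 1:]
            for aa in sorted(members - {original})]


print(is_conservative("L", "I"))
print(conservative_variants("AKD", 1))
```

Membership in the same class is only a first approximation of
functional equivalence; whether a given variant retains the
desired activity would still be determined by assay.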

In a specific embodiment of the invention, proteins
consisting of or comprising a fragment of a vertebrate
Serrate protein consisting of at least 10 (continuous) amino
acids of the Serrate protein are provided. In other
embodiments, the fragment consists of at least 20 or 50 amino
acids of the Serrate protein. In specific embodiments, such
fragments are not larger than 35, 100 or 200 amino acids.
Derivatives or analogs of vertebrate Serrate include but are
not limited to those peptides which are substantially
homologous to a vertebrate Serrate or a fragment thereof
(e.g., at least 30% identity over an amino acid sequence of
identical size) or whose encoding nucleic acid is capable of
hybridizing to a coding vertebrate Serrate sequence.

The Serrate derivatives and analogs of the
invention can be produced by various methods known in the
art. The manipulations which result in their production can
occur at the gene or protein level. For example, the cloned
Serrate gene sequence can be modified by any of numerous
strategies known in the art (Maniatis, T., 1990, Molecular
Cloning, A Laboratory Manual, 2d ed., Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York). The sequence can
be cleaved at appropriate sites with restriction
endonuclease(s), followed by further enzymatic modification
if desired, isolated, and ligated in vitro. In the
production of the gene encoding a derivative or analog of
Serrate, care should be taken to ensure that the modified
gene remains within the same translational reading frame as
Serrate, uninterrupted by translational stop signals, in the
gene region where the desired Serrate activity is encoded.
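
By way of illustration only, the reading frame check described
above can be sketched in Python as follows. The sketch reports
a frameshift when the coding sequence length is not a multiple
of three and reports the first in-frame stop codon found before
the final codon. The function name and the example sequences
are hypothetical and are not Serrate sequences.

```python
# Minimal sketch: check that a modified coding sequence stays in frame and
# is not interrupted by a translational stop signal.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_problems(coding_seq):
    """Return a list of problems found; an empty list means none found.

    The sequence is read from its first base in frame 0. The final codon is
    allowed to be a stop codon; any earlier stop is reported as premature.
    """
    seq = coding_seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3 (frameshift)")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    for n, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {n + 1}")
            break
    return problems


original = "ATGGCTGAAAAATGA"         # ATG GCT GAA AAA TGA
in_frame_insert = "ATGGCTGGCGAAAAATGA"  # one extra codon, frame preserved
one_base_insert = "ATGGCTTGAAAAATGA"    # single base added after codon 2

for name, seq in [("original", original),
                  ("in-frame insert", in_frame_insert),
                  ("one-base insert", one_base_insert)]:
    print(name, frame_problems(seq))
```

In this example the single base insertion both shifts the frame
and creates an in-frame stop, which is the kind of interruption
the passage above cautions against.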

Additionally, the Serrate-encoding nucleic acid
sequence can be mutated in vitro or in vivo, to create and/or
destroy translation, initiation, and/or termination
sequences, or to create variations in coding regions and/or
form new restriction endonuclease sites or destroy
preexisting ones, to facilitate further in vitro
modification. Any technique for mutagenesis known in the art
can be used, including but not limited to, in vitro site-
directed mutagenesis (Hutchinson, C., et al., 1978, J. Biol.
Chem. 253:6551), use of TAB linkers (Pharmacia), etc.

Manipulations of the Serrate sequence may also be
made at the protein level. Included within the scope of the
invention are Serrate protein fragments or other derivatives
or analogs which are differentially modified during or after
translation, e.g., by glycosylation, acetylation,
phosphorylation, amidation, derivatization by known
protecting/blocking groups, proteolytic cleavage, linkage to
an antibody molecule or other cellular ligand, etc. Any of
numerous chemical modifications may be carried out by known
techniques, including but not limited to specific chemical
cleavage by cyanogen bromide, trypsin, chymotrypsin, papain,
V8 protease, NaBH4; acetylation, formylation, oxidation,
reduction; metabolic synthesis in the presence of
tunicamycin; etc.

In addition, analogs and derivatives of Serrate can
be chemically synthesized. For example, a peptide
corresponding to a portion of a Serrate protein which
comprises the desired domain (see Section 5.6.1), or which
mediates the desired aggregation activity in vitro, or
binding to a receptor, can be synthesized by use of a peptide
synthesizer. Furthermore, if desired, nonclassical amino
acids or chemical amino acid analogs can be introduced as a
substitution or addition into the Serrate sequence. Non-
classical amino acids include but are not limited to the D-
isomers of the common amino acids, α-amino isobutyric acid,
4-aminobutyric acid, hydroxyproline, sarcosine, citrulline,
cysteic acid, t-butylglycine, t-butylalanine, phenylglycine,
cyclohexylalanine, β-alanine, designer amino acids such as β-
methyl amino acids, Cα-methyl amino acids, and Nα-methyl
amino acids.

In a specific embodiment, the Serrate derivative is
a chimeric, or fusion, protein comprising a vertebrate
Serrate protein or fragment thereof (preferably consisting of
at least a domain or motif of the Serrate protein, or at
least 10 amino acids of the Serrate protein) joined at its
amino- or carboxy-terminus via a peptide bond to an amino
acid sequence of a different protein. In one embodiment,
such a chimeric protein is produced by recombinant expression
of a nucleic acid encoding the protein (comprising a
Serrate-coding sequence joined in-frame to a coding sequence for a
different protein). Such a chimeric product can be made by
ligating the appropriate nucleic acid sequences encoding the
desired amino acid sequences to each other by methods known
in the art, in the proper coding frame, and expressing the
chimeric product by methods commonly known in the art.
Alternatively, such a chimeric product may be made by protein
synthetic techniques, e.g., by use of a peptide synthesizer.
In a specific embodiment, a chimeric nucleic acid encoding a
mature vertebrate Serrate protein with a heterologous signal
sequence is expressed such that the chimeric protein is
expressed and processed by the cell to the mature Serrate
protein. As another example, and not by way of limitation, a
recombinant molecule can be constructed according to the
invention, comprising coding portions of both Serrate and
another toporythmic gene, e.g., Delta. The encoded protein
of such a recombinant molecule could exhibit properties
associated with both Serrate and Delta and portray a novel
profile of biological activities, including agonists as well
as antagonists. The primary sequence of Serrate and Delta
may also be used to predict tertiary structure of the
molecules using computer simulation (Hopp and Woods, 1981,
Proc. Natl. Acad. Sci. U.S.A. 78:3824-3828); Serrate/Delta
chimeric recombinant genes could be designed in light of
correlations between tertiary structure and biological
function. Likewise, chimeric genes comprising portions of a
vertebrate Serrate fused to any heterologous protein-encoding
sequences may be constructed. A specific embodiment relates
to a chimeric protein comprising a fragment of a vertebrate
Serrate of at least ten amino acids.
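
By way of illustration only, and not by way of limitation, the in-frame joining of two coding sequences described above can be sketched computationally. The function name fuse_in_frame and the toy sequences are hypothetical and are not taken from this specification; the sketch assumes both inputs are complete coding sequences that begin at the first position of a codon, and it checks only the reading frame and the absence of a premature stop codon.

```python
# Illustrative sketch only: fuse_in_frame is a hypothetical helper, and the
# sequences passed to it in practice would be the user's own coding sequences.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds: str, downstream_cds: str) -> str:
    """Join two coding sequences so that the downstream one stays in frame."""
    up = upstream_cds.upper()
    down = downstream_cds.upper()

    # Both partners must contain a whole number of codons.
    for name, seq in (("upstream", up), ("downstream", down)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} sequence length is not a multiple of 3")

    # Drop a terminal stop codon from the upstream partner, if present,
    # so that translation continues into the downstream partner.
    if up[-3:] in STOP_CODONS:
        up = up[:-3]

    fused = up + down

    # Reject fusions that contain a stop codon before the final codon.
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature stop codon in fused reading frame")
    return fused
```

For example, fuse_in_frame("ATGGCCTAA", "GGTAAATGA") removes the upstream stop codon and returns a single open reading frame. A real construct would additionally require attention to linkers, signal sequences and restriction sites, which this sketch does not address.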

In another specific embodiment, the Serrate
derivative is a fragment of Serrate comprising a region of
homology with another toporythmic protein. As used herein, a
region of a first protein shall be considered "homologous" to
a second protein when the amino acid sequence of the region
is at least 30% identical or at least 75% either identical or
involving conservative changes, when compared to any sequence
in the second protein of an equal number of amino acids as
the number contained in the region. For example, such a
Serrate fragment can comprise one or more regions homologous
to Delta, or DSL domains or portions thereof.
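
As a non-limiting illustration, the definition of "homologous" given above can be expressed as an ungapped sliding-window comparison. The specification does not enumerate which substitutions are conservative, so the groupings in CONSERVATIVE_GROUPS below are an assumption made solely for this sketch, as are the function names.

```python
# Illustrative sketch only. The conservative-substitution groups are an
# ASSUMPTION; the text does not define them.
CONSERVATIVE_GROUPS = ["AVLIM", "FYW", "ST", "NQ", "DE", "KRH"]


def _is_conservative(a: str, b: str) -> bool:
    """True if residues a and b fall in the same assumed group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def is_homologous(region: str, second_protein: str,
                  min_identity: float = 0.30,
                  min_similarity: float = 0.75) -> bool:
    """Apply the 30% identity / 75% identity-or-conservative test.

    The region is compared, without gaps, to every stretch of the second
    protein that has the same number of amino acids as the region.
    """
    n = len(region)
    if n == 0 or n > len(second_protein):
        return False
    for start in range(len(second_protein) - n + 1):
        window = second_protein[start:start + n]
        identical = sum(a == b for a, b in zip(region, window))
        similar = sum(a == b or _is_conservative(a, b)
                      for a, b in zip(region, window))
        if identical / n >= min_identity or similar / n >= min_similarity:
            return True
    return False
```

Under these assumptions, a region is reported as homologous as soon as any equal-length window of the second protein satisfies either threshold. A different grouping of conservative changes would change the outcome of the 75% test but not of the 30% identity test.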

Other specific embodiments of derivatives and 
analogs are described in the subsections below and examples 
sections infra.

5.6.1. DERIVATIVES OF SERRATE CONTAINING 
ONE OR MORE DOMAINS OF THE PROTEIN

In a specific embodiment, the invention relates to 
vertebrate Serrate derivatives and analogs, in particular 
vertebrate Serrate fragments and derivatives of such 
fragments, that comprise, or alternatively consist of, one or 
more domains of the Serrate protein, including but not 
limited to the extracellular domain, DSL domain, ELR domain, 
cysteine rich domain, transmembrane domain, intracellular 
domain, membrane-associated region, and one or more of the 
EGF-like repeats (ELR) of the Serrate protein, or any 
combination of the foregoing. In particular examples 
relating to the human and chick Serrate proteins, such 
domains are identified in Examples Section 9 and 8, 
respectively. 

In a specific embodiment, the molecules comprising 
specific fragments of vertebrate Serrate are those comprising 
fragments in the respective Serrate protein most homologous 
to specific fragments of the Drosophila Serrate and/or Delta 
proteins. In particular embodiments, such a molecule 
comprises or consists of the amino acid sequences homologous 
to SEQ ID NO: 10, 12, or 18. Alternatively, a fragment 
comprising a domain of a Serrate homolog can be identified by 
protein analysis methods as described in Section 5.3.2.

5.6.2. DERIVATIVES OF SERRATE THAT MEDIATE
BINDING TO TOPORYTHMIC PROTEIN DOMAINS

The invention also provides for vertebrate Serrate
fragments, and analogs or derivatives of such fragments,
which mediate binding to toporythmic proteins (and thus are
termed herein "adhesive"), and nucleic acid sequences
encoding the foregoing.

In a specific embodiment, the adhesive fragment of
Serrate is that comprising the portion of Serrate most
homologous to about amino acid numbers 85-283 or 79-282 of
the Drosophila Serrate sequence (see PCT Publication
WO 93/12141 dated June 24, 1993).

In a particular embodiment, the adhesive fragment
of a Serrate protein comprises the DSL domain, or a portion
thereof. Subfragments within the DSL domain that mediate
binding to Notch can be identified by analysis of constructs
expressing deletion mutants.

The ability to bind to a toporythmic protein
(preferably Notch) can be demonstrated by in vitro
aggregation assays with cells expressing such a toporythmic
protein as well as cells expressing Serrate or a Serrate
derivative (see Section 5.7). That is, the ability of a
Serrate fragment to bind to a Notch protein can be
demonstrated by detecting the ability of the Serrate
fragment, when expressed on the surface of a first cell, to
bind to a Notch protein expressed on the surface of a second
cell.

The nucleic acid sequences encoding toporythmic
proteins or adhesive domains thereof, for use in such assays,
can be isolated from human, porcine, bovine, feline, avian,
equine, canine, or insect, as well as primate sources and any
other species in which homologs of known toporythmic genes
can be identified.



5.7. ASSAYS OF SERRATE PROTEINS,
DERIVATIVES AND ANALOGS 

The functional activity of vertebrate Serrate 
proteins, derivatives and analogs can be assayed by various 
methods.

For example, in one embodiment, where one is
assaying for the ability to bind or compete with wild-type
Serrate for binding to anti-Serrate antibody, various
immunoassays known in the art can be used, including but not
limited to competitive and non-competitive assay systems
using techniques such as radioimmunoassays, ELISA (enzyme
linked immunosorbent assay), "sandwich" immunoassays,
immunoradiometric assays, gel diffusion precipitin reactions,
immunodiffusion assays, in situ immunoassays (using colloidal
gold, enzyme or radioisotope labels, for example), western
blots, precipitation reactions, agglutination assays (e.g.,
gel agglutination assays, hemagglutination assays),
complement fixation assays, immunofluorescence assays, 
protein A assays, and immunoelectrophoresis assays, etc. In 
one embodiment, antibody binding is detected by detecting a 
label on the primary antibody. In another embodiment, the 
primary antibody is detected by detecting binding of a 
secondary antibody or reagent to the primary antibody. In a 
further embodiment, the secondary antibody is labeled. Many 
means are known in the art for detecting binding in an 
immunoassay and are within the scope of the present 
invention. 

In another embodiment, where one is assaying for 
the ability to mediate binding to a toporythmic protein, 
e.g., Notch, one can carry out an in vitro aggregation assay 
such as described in PCT Publication WO 93/12141 dated June 
24, 1993 (see also Fehon et al., 1990, Cell 61:523-534; Rebay 
et al., 1991, Cell 67:687-699). 

In another embodiment, where a receptor for Serrate 
is identified, receptor binding can be assayed, e.g., by 
means well-known in the art. In another embodiment, 
physiological correlates of Serrate binding to cells
expressing a Serrate receptor (signal transduction) can be
assayed.

In another embodiment, in insect or other model
systems, genetic studies can be done to study the phenotypic
effect of a Serrate mutant that is a derivative or analog of
wild-type vertebrate Serrate.

Other methods will be known to the skilled artisan
and are within the scope of the invention.

5.8. THERAPEUTIC USES

The invention provides for treatment of disorders
of cell fate or differentiation by administration of a
therapeutic compound of the invention. Such therapeutic
compounds (termed herein "Therapeutics") include: vertebrate
Serrate proteins and analogs and derivatives (including
fragments) thereof (e.g., as described hereinabove);
antibodies thereto (as described hereinabove); nucleic acids
encoding the vertebrate Serrate proteins, analogs, or
derivatives (e.g., as described hereinabove); and Serrate
antisense nucleic acids. As stated supra, the Antagonist
Therapeutics of the invention are those Therapeutics which
antagonize, or inhibit, a vertebrate Serrate function and/or
Notch function (since Serrate is a Notch ligand). Such
Antagonist Therapeutics are most preferably identified by use
of known convenient in vitro assays, e.g., based on their
ability to inhibit binding of Serrate to another protein
(e.g., a Notch protein), or inhibit any known Notch or
Serrate function as preferably assayed in vitro or in cell
culture, although genetic assays (e.g., in Drosophila) may
also be employed. In a preferred embodiment, the Antagonist
Therapeutic is a protein or derivative thereof comprising a
functionally active fragment such as a fragment of Serrate
which mediates binding to Notch, or an antibody thereto. In
other specific embodiments, such an Antagonist Therapeutic is
a nucleic acid capable of expressing a molecule comprising a
fragment of Serrate which binds to Notch, or a Serrate
antisense nucleic acid (see Section 5.11 herein). It should
be noted that preferably, suitable in vitro or in vivo
assays, as described infra, should be utilized to determine
the effect of a specific Therapeutic and whether its
administration is indicated for treatment of the affected
tissue, since the developmental history of the tissue may
determine whether an Antagonist or Agonist Therapeutic is
desired.

In addition, the mode of administration, e.g.,
whether administered in soluble form or administered via its
encoding nucleic acid for intracellular recombinant
expression, of the Serrate protein or derivative can affect
whether it acts as an agonist or antagonist.

In another embodiment of the invention, a nucleic
acid containing a portion of a vertebrate Serrate gene is
used, as an Antagonist Therapeutic, to promote Serrate
inactivation by homologous recombination (Koller and
Smithies, 1989, Proc. Natl. Acad. Sci. USA 86:8932-8935;
Zijlstra et al., 1989, Nature 342:435-438).

The Agonist Therapeutics of the invention, as
described supra, promote Serrate function. Such Agonist
Therapeutics include but are not limited to proteins and
derivatives comprising the portions of Notch that mediate
binding to Serrate, and nucleic acids encoding the foregoing
(which can be administered to express their encoded products
in vivo).

Further descriptions and sources of Therapeutics of 
the invention are found in Sections 5.1 through 5.7 herein.

Molecules which retain, or alternatively inhibit, a
desired Serrate property, e.g., binding to Notch, binding to
an intracellular ligand, can be used therapeutically as
inducers, or inhibitors, respectively, of such property and
its physiological correlates. In a specific embodiment, a
peptide (e.g., in the range of 10-50 or 15-25 amino acids;
and particularly of about 10, 15, 20 or 25 amino acids)
containing the sequence of a portion of a vertebrate Serrate
which binds to Notch is used to antagonize Notch function.
In a specific embodiment, such an Antagonist Therapeutic is
used to treat or prevent human or other malignancies
associated with increased Notch expression (e.g., cervical
cancer, colon cancer, breast cancer, squamous adenocarcinomas
(see infra)). Derivatives or analogs of Serrate can be
tested for the desired activity by procedures known in the
art, including but not limited to the assays described in the
examples infra. For example, molecules comprising vertebrate
Serrate fragments which bind to Notch EGF-repeats (ELR) 11
and 12 and which are smaller than a DSL domain, can be
obtained and selected by expressing deletion mutants and
assaying for binding of the expressed product to Notch by any
of the several methods (e.g., in vitro cell aggregation
assays, interaction trap system), some of which are described
in the Examples Sections infra. In one specific embodiment,
peptide libraries can be screened to select a peptide with
the desired activity; such screening can be carried out by
assaying, e.g., for binding to Notch or a molecule containing
the Notch ELR 11 and 12 repeats.
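
As an illustration of how candidate peptides of the lengths mentioned above (about 10, 15, 20 or 25 amino acids) might be enumerated from a protein region before screening, the following sketch tiles a sequence into overlapping windows. The function name tile_peptides, the step parameter and the example sequence are hypothetical; which peptides actually bind Notch can only be determined by the binding assays described herein.

```python
# Illustrative sketch only: enumerates candidate peptides; it does not
# predict binding. tile_peptides and its parameters are hypothetical.
def tile_peptides(sequence, lengths=(10, 15, 20, 25), step=1):
    """Return (start, length, peptide) for each window of each length.

    start is the 1-based position of the first residue of the peptide
    within the input sequence.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    peptides = []
    for length in lengths:
        # Sequences shorter than the window yield no peptides of that length.
        for offset in range(0, len(sequence) - length + 1, step):
            peptides.append((offset + 1, length,
                             sequence[offset:offset + length]))
    return peptides
```

For a 12-residue toy sequence such as "ACDEFGHIKLMN", only the 10-residue windows fit, so three candidates are returned. A larger step reduces the number of peptides at the cost of coarser coverage.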

The Agonist and Antagonist Therapeutics of the
invention have therapeutic utility for disorders of cell
fate. The Agonist Therapeutics are administered
therapeutically (including prophylactically): (1) in diseases
or disorders involving an absence or decreased (relative to
normal, or desired) levels of Notch or Serrate function, for
example, in patients where Notch or Serrate protein is
lacking, genetically defective, biologically inactive or
underactive, or underexpressed; and (2) in diseases or
disorders wherein in vitro (or in vivo) assays (see infra)
indicate the utility of Serrate agonist administration. The
absence or decreased levels in Notch or Serrate function can
be readily detected, e.g., by obtaining a patient tissue
sample (e.g., from biopsy tissue) and assaying it in vitro
for protein levels, structure and/or activity of the
expressed Notch or Serrate protein. Many methods standard in
the art can be thus employed, including but not limited to
immunoassays to detect and/or visualize Notch or Serrate
protein (e.g., Western blot, immunoprecipitation followed by
sodium dodecyl sulfate polyacrylamide gel electrophoresis,
immunocytochemistry, etc.) and/or hybridization assays to
detect Notch or Serrate expression by detecting and/or
visualizing respectively Notch or Serrate mRNA (e.g.,
Northern assays, dot blots, in situ hybridization, etc.).

In vitro assays which can be used to determine
whether administration of a specific Agonist Therapeutic or
Antagonist Therapeutic is indicated, include in vitro cell
culture assays in which a patient tissue sample is grown in
culture, and exposed to or otherwise administered a
Therapeutic, and the effect of such Therapeutic upon the
tissue sample is observed. In one embodiment, where the
patient has a malignancy, a sample of cells from such
malignancy is plated out or grown in culture, and the cells
are then exposed to a Therapeutic. A Therapeutic which
inhibits survival or growth of the malignant cells (e.g., by
promoting terminal differentiation) is selected for
therapeutic use in vivo. Many assays standard in the art can
be used to assess such survival and/or growth; for example,
cell proliferation can be assayed by measuring ³H-thymidine
incorporation, by direct cell count, by detecting changes in
transcriptional activity of known genes such as
proto-oncogenes (e.g., fos, myc) or cell cycle markers; cell
viability can be assessed by trypan blue staining;
differentiation can be assessed visually based on changes in
morphology, etc. In a specific aspect, the malignant cell
cultures are separately exposed to (1) an Agonist
Therapeutic, and (2) an Antagonist Therapeutic; the result of
the assay can indicate which type of Therapeutic has
therapeutic efficacy.
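
By way of example only, the comparison described above, in which malignant cell cultures are separately exposed to an Agonist Therapeutic and an Antagonist Therapeutic, can be summarized numerically as follows. The function name, the use of replicate proliferation readouts (for instance cell counts or incorporation counts) and the 20% inhibition threshold are assumptions introduced for this sketch; the specification prescribes no particular cutoff or statistic.

```python
# Illustrative sketch only. The min_inhibition threshold is an ASSUMPTION,
# not a value taken from the text.
from statistics import mean


def indicated_therapeutic(control, agonist, antagonist, min_inhibition=0.2):
    """Return 'agonist', 'antagonist', or None.

    Each argument is a list of replicate proliferation readouts from
    untreated, agonist-treated and antagonist-treated cultures.
    """
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control readout must be positive")

    # Fractional inhibition of growth relative to the untreated culture.
    inhibition = {
        "agonist": 1 - mean(agonist) / baseline,
        "antagonist": 1 - mean(antagonist) / baseline,
    }
    best = max(inhibition, key=inhibition.get)
    return best if inhibition[best] >= min_inhibition else None
```

With control readouts averaging 1000, agonist-treated readouts averaging 1000 and antagonist-treated readouts averaging 500, the sketch reports the Antagonist Therapeutic as the type showing growth inhibition. A practical analysis would also consider replicate variability and the other endpoints named above, such as viability and differentiation.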

In another embodiment, a Therapeutic is indicated
for use which exhibits the desired effect, inhibition or
promotion of cell growth, upon a patient cell sample from
tissue having or suspected of having a hyper- or
hypoproliferative disorder, respectively. Such hyper- or
hypoproliferative disorders include but are not limited to
those described in Sections 5.8.1 through 5.8.3 infra.

In another specific embodiment, a Therapeutic is
indicated for use in treating nerve injury or a nervous
system degenerative disorder (see Section 5.8.2) which
exhibits in vitro promotion of nerve regeneration/neurite
extension from nerve cells of the affected patient type.

In addition, administration of an Antagonist
Therapeutic of the invention is also indicated in diseases or
disorders determined or known to involve a Notch or Serrate
dominant activated phenotype ("gain of function" mutations).
Administration of an Agonist Therapeutic is indicated in
diseases or disorders determined or known to involve a Notch
or Serrate dominant negative phenotype ("loss of function"
mutations). The functions of various structural domains of
the Notch protein have been investigated in vivo, by
ectopically expressing a series of Drosophila Notch deletion
mutants under the hsp70 heat-shock promoter, as well as
eye-specific promoters (see Rebay et al., 1993, Cell 74:319-329).
Two classes of dominant phenotypes were observed, one
suggestive of Notch loss-of-function mutations and the other
of Notch gain-of-function mutations. Dominant "activated"
phenotypes resulted from overexpression of a protein lacking
most extracellular sequences, while dominant "negative"
phenotypes resulted from overexpression of a protein lacking
most intracellular sequences. The results indicated that
Notch functions as a receptor whose extracellular domain
mediates ligand-binding, resulting in the transmission of
developmental signals by the cytoplasmic domain. We have
shown that Serrate binds to the Notch ELR 11 and 12 (see PCT
Publication WO 93/12141).

In various specific embodiments, in vitro assays
can be carried out with representative cells of cell types
involved in a patient's disorder, to determine if a
Therapeutic has a desired effect upon such cell types.

In another embodiment, cells of a patient tissue
sample suspected of being pre-neoplastic are similarly plated
out or grown in vitro, and exposed to a Therapeutic. The
Therapeutic which results in a cell phenotype that is more
normal (i.e., less representative of a pre-neoplastic state,
neoplastic state, malignant state, or transformed phenotype)
is selected for therapeutic use. Many assays standard in the
art can be used to assess whether a pre-neoplastic state,
neoplastic state, or a transformed or malignant phenotype, is
present. For example, characteristics associated with a
transformed phenotype (a set of in vitro characteristics
associated with a tumorigenic ability in vivo) include a more
rounded cell morphology, looser substratum attachment, loss
of contact inhibition, loss of anchorage dependence, release
of proteases such as plasminogen activator, increased sugar
transport, decreased serum requirement, expression of fetal
antigens, disappearance of the 250,000 dalton surface
protein, etc. (see Luria et al., 1978, General Virology, 3d
Ed., John Wiley & Sons, New York pp. 436-446).

In other specific embodiments, the in vitro assays
described supra can be carried out using a cell line, rather
than a cell sample derived from the specific patient to be
treated, in which the cell line is derived from or displays
characteristic(s) associated with the malignant, neoplastic
or pre-neoplastic disorder desired to be treated or
prevented, or is derived from the neural or other cell type
upon which an effect is desired, according to the present
invention.

The Antagonist Therapeutics are administered
therapeutically (including prophylactically): (1) in diseases
or disorders involving increased (relative to normal, or
desired) levels of Notch or Serrate function, for example,
where the Notch or Serrate protein is overexpressed or
overactive; and (2) in diseases or disorders wherein in vitro
(or in vivo) assays indicate the utility of Serrate
antagonist administration. The increased levels of Notch or
Serrate function can be readily detected by methods such as
those described above, by quantifying protein and/or RNA. In
vitro assays with cells of patient tissue sample or the
appropriate cell line or cell type, to determine therapeutic
utility, can be carried out as described above.

5.8.1. MALIGNANCIES

Malignant and pre-neoplastic conditions which can
be tested as described supra for efficacy of intervention
with Antagonist or Agonist Therapeutics, and which can be
treated upon thus observing an indication of therapeutic
utility, include but are not limited to those described below
in Sections 5.8.1 and 5.9.1.

Malignancies and related disorders, cells of which
type can be tested in vitro (and/or in vivo), and upon
observing the appropriate assay result, treated according to
the present invention, include but are not limited to those
listed in Table 1 (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co.,
Philadelphia):

TABLE 1

MALIGNANCIES AND RELATED DISORDERS

Leukemia
    acute leukemia
        acute lymphocytic leukemia
        acute myelocytic leukemia
            myeloblastic
            promyelocytic
            myelomonocytic
            monocytic
            erythroleukemia
    chronic leukemia
        chronic myelocytic (granulocytic) leukemia
        chronic lymphocytic leukemia
Polycythemia vera
Lymphoma
    Hodgkin's disease
    non-Hodgkin's disease
Multiple myeloma
Waldenström's macroglobulinemia
Heavy chain disease
Solid tumors
    sarcomas and carcinomas
        fibrosarcoma
        myxosarcoma
        liposarcoma
        chondrosarcoma
        osteogenic sarcoma
        chordoma
        angiosarcoma
        endotheliosarcoma
        lymphangiosarcoma
        lymphangioendotheliosarcoma
        synovioma
        mesothelioma
        Ewing's tumor
        leiomyosarcoma
        rhabdomyosarcoma
        colon carcinoma
        pancreatic cancer
        breast cancer
        ovarian cancer
        prostate cancer
        squamous cell carcinoma
        basal cell carcinoma
        adenocarcinoma
        sweat gland carcinoma
        sebaceous gland carcinoma
        papillary carcinoma
        papillary adenocarcinomas
        cystadenocarcinoma
        medullary carcinoma
        bronchogenic carcinoma
        renal cell carcinoma
        hepatoma
        bile duct carcinoma
        choriocarcinoma
        seminoma
        embryonal carcinoma
        Wilms' tumor
        cervical cancer
        testicular tumor
        lung carcinoma
        small cell lung carcinoma
        bladder carcinoma
        epithelial carcinoma
        glioma
        astrocytoma
        medulloblastoma
        craniopharyngioma
        ependymoma
        pinealoma
        hemangioblastoma
        acoustic neuroma
        oligodendroglioma
        meningioma
        melanoma
        neuroblastoma
        retinoblastoma

In specific embodiments, malignancy or
dysproliferative changes (such as metaplasias and dysplasias)
are treated or prevented in epithelial tissues such as those
in the cervix, esophagus, and lung.

Malignancies of the colon and cervix exhibit
increased expression of human Notch relative to such
non-malignant tissue (see PCT Publication no. WO 94/07474
published April 14, 1994, incorporated by reference herein in
its entirety). Thus, in specific embodiments, malignancies
or premalignant changes of the colon or cervix are treated or
prevented by administering an effective amount of an
Antagonist Therapeutic, e.g., a Serrate derivative, that
antagonizes Notch function. The presence of increased Notch
expression in colon and cervical cancer suggests that many
more cancerous and hyperproliferative conditions exhibit
upregulated Notch. Thus, in specific embodiments, various
cancers, e.g., breast cancer, squamous adenocarcinoma,
seminoma, melanoma, and lung cancer, and premalignant changes
therein, as well as other hyperproliferative disorders, can
be treated or prevented by administration of an Antagonist
Therapeutic that antagonizes Notch function.

5.8.2. NERVOUS SYSTEM DISORDERS

Nervous system disorders, involving cell types
which can be tested as described supra for efficacy of
intervention with Antagonist or Agonist Therapeutics, and
which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous
system injuries, and diseases or disorders which result in
either a disconnection of axons, a diminution or degeneration
of neurons, or demyelination. Nervous system lesions which
may be treated in a patient (including human and non-human
mammalian patients) according to the invention include but
are not limited to the following lesions of either the
central (including spinal cord, brain) or peripheral nervous
systems:

(i) traumatic lesions, including lesions caused by
      physical injury or associated with surgery,
      for example, lesions which sever a portion of
      the nervous system, or compression injuries;
(ii) ischemic lesions, in which a lack of oxygen in
      a portion of the nervous system results in
      neuronal injury or death, including cerebral
      infarction or ischemia, or spinal cord
      infarction or ischemia;
(iii) malignant lesions, in which a portion of the
      nervous system is destroyed or injured by
      malignant tissue which is either a nervous
      system associated malignancy or a malignancy
      derived from non-nervous system tissue;
(iv) infectious lesions, in which a portion of the
      nervous system is destroyed or injured as a
      result of infection, for example, by an
      abscess or associated with infection by human
      immunodeficiency virus, herpes zoster, or
      herpes simplex virus or with Lyme disease,
      tuberculosis, syphilis;
(v) degenerative lesions, in which a portion of
      the nervous system is destroyed or injured as
      a result of a degenerative process including
      but not limited to degeneration associated
      with Parkinson's disease, Alzheimer's disease,
      Huntington's chorea, or amyotrophic lateral
      sclerosis;
(vi) lesions associated with nutritional diseases
      or disorders, in which a portion of the
      nervous system is destroyed or injured by a
      nutritional disorder or disorder of metabolism
      including but not limited to, vitamin B12
      deficiency, folic acid deficiency, Wernicke
      disease, tobacco-alcohol amblyopia,
      Marchiafava-Bignami disease (primary
      degeneration of the corpus callosum), and
      alcoholic cerebellar degeneration;
(vii) neurological lesions associated with systemic
      diseases including but not limited to diabetes
      (diabetic neuropathy, Bell's palsy), systemic
      lupus erythematosus, carcinoma, or
      sarcoidosis;
(viii) lesions caused by toxic substances including
      alcohol, lead, or particular neurotoxins; and
(ix) demyelinated lesions in which a portion of the
      nervous system is destroyed or injured by a
      demyelinating disease including but not
      limited to multiple sclerosis, human
      immunodeficiency virus-associated myelopathy,
      transverse myelopathy of various etiologies,
      progressive multifocal leukoencephalopathy,
      and central pontine myelinolysis.

Therapeutics which are useful according to the
invention for treatment of a nervous system disorder may be
selected by testing for biological activity in promoting the
survival or differentiation of neurons (see also Section
5.8). For example, and not by way of limitation,
Therapeutics which elicit any of the following effects may be
useful according to the invention:

(i) increased survival time of neurons in culture;
(ii) increased sprouting of neurons in culture or
      in vivo;
(iii) increased production of a neuron-associated
      molecule in culture or in vivo, e.g., choline
      acetyltransferase or acetylcholinesterase with
      respect to motor neurons; or
(iv) decreased symptoms of neuron dysfunction in
      vivo.

Such effects may be measured by any method known in the art.
In preferred, non-limiting embodiments, increased survival of
neurons may be measured by the method set forth in Arakawa et
al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of
neurons may be detected by methods set forth in Pestronk et
al. (1980, Exp. Neurol. 70:65-82) or Brown et al. (1981, Ann.
Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic
assay, antibody binding, Northern blot assay, etc., depending
on the molecule to be measured; and motor neuron dysfunction
may be measured by assessing the physical manifestation of
motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders
that may be treated according to the invention include but
are not limited to disorders such as infarction, infection,
exposure to toxin, trauma, surgical damage, degenerative
disease or malignancy that may affect motor neurons as well
as other components of the nervous system, as well as
disorders that selectively affect neurons such as amyotrophic
lateral sclerosis, and including but not limited to
progressive spinal muscular atrophy, progressive bulbar
palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood
(Fazio-Londe syndrome), poliomyelitis and the post polio
syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

5.8.3. TISSUE REPAIR AND REGENERATION

In another embodiment of the invention, a
Therapeutic of the invention is used for promotion of tissue
regeneration and repair, including but not limited to
treatment of benign dysproliferative disorders. Specific
embodiments are directed to treatment of cirrhosis of the
liver (a condition in which scarring has overtaken normal
liver regeneration processes), treatment of keloid
(hypertrophic scar) formation (disfiguring of the skin in
which the scarring process interferes with normal renewal),
psoriasis (a common skin condition characterized by excessive
proliferation of the skin and delay in proper cell fate
determination), and baldness (a condition in which terminally
differentiated hair follicles (a tissue rich in Notch) fail
to function properly). In another embodiment, a Therapeutic
of the invention is used to treat degenerative or traumatic
disorders of the sensory epithelium of the inner ear.

5.9. PROPHYLACTIC USES

5.9.1. MALIGNANCIES

The Therapeutics of the invention can be
administered to prevent progression to a neoplastic or
malignant state, including but not limited to those disorders
listed in Table 1. Such administration is indicated where
the Therapeutic is shown in assays, as described supra, to
have utility for treatment or prevention of such disorder.
Such prophylactic use is indicated in conditions known or
suspected of preceding progression to neoplasia or cancer, in
particular, where non-neoplastic cell growth consisting of
hyperplasia, metaplasia, or most particularly, dysplasia has
occurred (for review of such abnormal growth conditions, see
Robbins and Angell, 1976, Basic Pathology, 2d Ed., W.B.
Saunders Co., Philadelphia, pp. 68-79). Hyperplasia is a
form of controlled cell proliferation involving an increase
in cell number in a tissue or organ, without significant
alteration in structure or function. As but one example,
endometrial hyperplasia often precedes endometrial cancer.
Metaplasia is a form of controlled cell growth in which one
type of adult or fully differentiated cell substitutes for
another type of adult cell. Metaplasia can occur in
epithelial or connective tissue cells. Atypical metaplasia
involves a somewhat disorderly metaplastic epithelium.
Dysplasia is frequently a forerunner of cancer, and is found
mainly in the epithelia; it is the most disorderly form of
non-neoplastic cell growth, involving a loss in individual
cell uniformity and in the architectural orientation of
cells. Dysplastic cells often have abnormally large, deeply
stained nuclei, and exhibit pleomorphism. Dysplasia
characteristically occurs where there exists chronic
irritation or inflammation, and is often found in the cervix,
respiratory passages, oral cavity, and gall bladder.

Alternatively or in addition to the presence of
abnormal cell growth characterized as hyperplasia,
metaplasia, or dysplasia, the presence of one or more
characteristics of a transformed phenotype, or of a malignant
phenotype, displayed in vivo or displayed in vitro by a cell
sample from a patient, can indicate the desirability of
prophylactic/therapeutic administration of a Therapeutic of
the invention. As mentioned supra, such characteristics of a
transformed phenotype include morphology changes, looser
substratum attachment, loss of contact inhibition, loss of
anchorage dependence, protease release, increased sugar
transport, decreased serum requirement, expression of fetal
antigens, disappearance of the 250,000 dalton cell surface
protein, etc. (see also id., at pp. [illegible] for
characteristics associated with a transformed or malignant
phenotype).

In a specific embodiment, leukoplakia, a benign-appearing
hyperplastic or dysplastic lesion of the epithelium, or
Bowen's disease, a carcinoma in situ, are pre-neoplastic
lesions indicative of the desirability of prophylactic
intervention.

In another embodiment, fibrocystic disease (cystic
hyperplasia, mammary dysplasia, particularly adenosis (benign
epithelial hyperplasia)) is indicative of the desirability of
prophylactic intervention.

In other embodiments, a patient which exhibits one
or more of the following predisposing factors for malignancy
is treated by administration of an effective amount of a
Therapeutic: a chromosomal translocation associated with a
malignancy (e.g., the Philadelphia chromosome for chronic
myelogenous leukemia, t(14;18) for follicular lymphoma,
etc.), familial polyposis or Gardner's syndrome (possible
forerunners of colon cancer), benign monoclonal gammopathy (a
possible forerunner of multiple myeloma), and a first degree
kinship with persons having a cancer or precancerous disease
showing a Mendelian (genetic) inheritance pattern (e.g.,
familial polyposis of the colon, Gardner's syndrome,
hereditary exostosis, polyendocrine adenomatosis, medullary
thyroid carcinoma with amyloid production and
pheochromocytoma, Peutz-Jeghers syndrome, neurofibromatosis
of Von Recklinghausen, retinoblastoma, carotid body tumor,
cutaneous melanocarcinoma, intraocular melanocarcinoma,
xeroderma pigmentosum, ataxia telangiectasia, Chediak-Higashi
syndrome, albinism, Fanconi's aplastic anemia, and Bloom's
syndrome; see Robbins and Angell, 1976, Basic Pathology, 2d
Ed., W.B. Saunders Co., Philadelphia, pp. 112-113), etc.

In another specific embodiment, an Antagonist 
Therapeutic of the invention is administered to a human 
patient to prevent progression to breast, colon, or cervical 
cancer. 


5.9.2. OTHER DISORDERS 
In other embodiments, a Therapeutic of the
invention can be administered to prevent a nervous system
disorder described in Section 5.8.2, or other disorder (e.g.,
liver cirrhosis, psoriasis, keloids, baldness) described in
Section 5.8.3.



5.10. DEMONSTRATION OF THERAPEUTIC 
OR PROPHYLACTIC UTILITY 

The Therapeutics of the invention can be tested in
vivo for the desired therapeutic or prophylactic activity.
For example, such compounds can be tested in suitable animal
model systems prior to testing in humans, including but not
limited to rats, mice, chickens, cows, monkeys, rabbits, etc.
For in vivo testing, prior to administration to humans, any
animal model system known in the art may be used.

5.11. ANTISENSE REGULATION OF SERRATE EXPRESSION

The present invention provides the therapeutic or
prophylactic use of nucleic acids of at least six or of at
least ten nucleotides that are antisense to a gene or cDNA
encoding a vertebrate Serrate or a portion thereof.
"Antisense" as used herein refers to a nucleic acid capable
of hybridizing to a portion of a vertebrate Serrate RNA
(preferably mRNA) by virtue of some sequence complementarity.
Such antisense nucleic acids have utility as Antagonist
Therapeutics of the invention, and can be used in the
treatment or prevention of disorders as described supra in
Section 5.8 and its subsections.

The antisense nucleic acids of the invention can be
oligonucleotides that are double-stranded or single-stranded,
RNA or DNA or a modification or derivative thereof, which can
be directly administered to a cell, or which can be produced
intracellularly by transcription of exogenous, introduced
sequences.

In a specific embodiment, the Serrate antisense
nucleic acids provided by the instant invention can be used
for the treatment of tumors or other disorders, the cells of
which tumor type or disorder can be demonstrated (in vitro or
in vivo) to express a Serrate gene or a Notch gene. Such
demonstration can be by detection of RNA or of protein.

The invention further provides pharmaceutical
compositions comprising an effective amount of the Serrate
antisense nucleic acids of the invention in a
pharmaceutically acceptable carrier, as described infra in
Section 5.12. Methods for treatment and prevention of
disorders (such as those described in Sections 5.8 and 5.9)
comprising administering the pharmaceutical compositions of
the invention are also provided.

In another embodiment, the invention is directed to
methods for inhibiting the expression of a Serrate nucleic
acid sequence in a prokaryotic or eukaryotic cell comprising
providing the cell with an effective amount of a composition
comprising an antisense vertebrate Serrate nucleic acid of
the invention.

Serrate antisense nucleic acids and their uses are
described in detail below.

5.11.1. VERTEBRATE SERRATE ANTISENSE NUCLEIC ACIDS

The vertebrate Serrate antisense nucleic acids are
of at least six nucleotides and are preferably
oligonucleotides (ranging preferably from 10 to about 50
nucleotides). In specific aspects, the oligonucleotide
contains at least 10 nucleotides, at least 15 nucleotides, at
least 100 nucleotides, or at least 200 nucleotides antisense
to a Serrate gene. The oligonucleotides can be DNA or RNA or
chimeric mixtures or derivatives or modified versions
thereof, single-stranded or double-stranded. The
oligonucleotide can be modified at the base moiety, sugar
moiety, or phosphate backbone. The oligonucleotide may
include other appending groups such as peptides, or agents
facilitating transport across the cell membrane (see, e.g.,
Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci.
84:648-652; PCT Publication No. WO 88/09810, published
December 15, 1988) or blood-brain barrier (see, e.g., PCT
Publication No. WO 89/10134, published April 25, 1988),
hybridization-triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents
(see, e.g., Zon, 1988, Pharm. Res. 5:539-549).

In a preferred aspect of the invention, a
vertebrate Serrate antisense oligonucleotide is provided,
preferably of single-stranded DNA. In a most preferred
aspect, such an oligonucleotide comprises a sequence
antisense to the sequence encoding an SH3 binding domain or a
Notch-binding domain of Serrate, most preferably, of a human
Serrate homolog. The oligonucleotide may be modified at any
position on its structure with substituents generally known
in the art.

The Serrate antisense oligonucleotide may comprise
at least one modified base moiety which is selected from the
group including but not limited to 5-fluorouracil,
5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil,
5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine,
5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil,
5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil,
2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid
(v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl)uracil, (acp3)w, and
2,6-diaminopurine.

In another embodiment, the oligonucleotide
comprises at least one modified sugar moiety selected from
the group including but not limited to arabinose,
2-fluoroarabinose, xylulose, and hexose.

In yet another embodiment, the oligonucleotide
comprises at least one modified phosphate backbone selected
from the group consisting of a phosphorothioate, a
phosphorodithioate, a phosphoramidothioate, a
phosphoramidate, a phosphordiamidate, a methylphosphonate, an
alkyl phosphotriester, and a formacetal or analog thereof.

In yet another embodiment, the oligonucleotide is
an α-anomeric oligonucleotide. An α-anomeric oligonucleotide
forms specific double-stranded hybrids with complementary RNA
in which, contrary to the usual β-units, the strands run
parallel to each other (Gautier et al., 1987, Nucl. Acids
Res. 15:6625-6641).

The oligonucleotide may be conjugated to another 
molecule, e.g., a peptide, hybridization triggered cross- 
linking agent, transport agent, hybridization-triggered 
cleavage agent, etc. 

Oligonucleotides of the invention may be
synthesized by standard methods known in the art, e.g. by use
of an automated DNA synthesizer (such as are commercially
available from Biosearch, Applied Biosystems, etc.). As
examples, phosphorothioate oligonucleotides may be
synthesized by the method of Stein et al. (1988, Nucl. Acids
Res. 16:3209), methylphosphonate oligonucleotides can be
prepared by use of controlled pore glass polymer supports
(Sarin et al., 1988, Proc. Natl. Acad. Sci. U.S.A.
85:7448-7451), etc.

In a specific embodiment, the Serrate antisense
oligonucleotide comprises catalytic RNA, or a ribozyme (see,
e.g., PCT International Publication No. WO 90/11364, published
October 4, 1990; Sarver et al., 1990, Science 247:1222-1225).
In another embodiment, the oligonucleotide is a
2'-O-methylribonucleotide (Inoue et al., 1987, Nucl. Acids
Res. 15:6131-6148), or a chimeric RNA-DNA analogue (Inoue et
al., 1987, FEBS Lett. 215:327-330).

In an alternative embodiment, the Serrate antisense
nucleic acid of the invention is produced intracellularly by
transcription from an exogenous sequence. For example, a
vector can be introduced in vivo such that it is taken up by
a cell, within which cell the vector or a portion thereof is
transcribed, producing an antisense nucleic acid (RNA) of the
invention. Such a vector would contain a sequence encoding
the Serrate antisense nucleic acid. Such a vector can remain
episomal or become chromosomally integrated, as long as it
can be transcribed to produce the desired antisense RNA.
Such vectors can be constructed by recombinant DNA technology
methods standard in the art. Vectors can be plasmid, viral,
or others known in the art, used for replication and
expression in mammalian cells. Expression of the sequence
encoding the Serrate antisense RNA can be by any promoter
known in the art to act in mammalian, preferably human,
cells. Such promoters can be inducible or constitutive.
Such promoters include but are not limited to: the SV40 early
promoter region (Benoist and Chambon, 1981, Nature
290:304-310), the promoter contained in the 3' long terminal
repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell
22:787-797), the herpes thymidine kinase promoter (Wagner et
al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the
regulatory sequences of the metallothionein gene (Brinster et
al., 1982, Nature 296:39-42), etc.

The antisense nucleic acids of the invention
comprise a sequence complementary to at least a portion of an
RNA transcript specific to a vertebrate Serrate gene,
preferably a human Serrate gene. However, absolute
complementarity, although preferred, is not required. A
sequence "complementary to at least a portion of an RNA," as
referred to herein, means a sequence having sufficient
complementarity to be able to hybridize with the RNA, forming
a stable duplex; in the case of double-stranded Serrate
antisense nucleic acids, a single strand of the duplex DNA
may thus be tested, or triplex formation may be assayed. The
ability to hybridize will depend on both the degree of
complementarity and the length of the antisense nucleic acid.
Generally, the longer the hybridizing nucleic acid, the more
base mismatches with a Serrate RNA it may contain and still
form a stable duplex (or triplex, as the case may be). One
skilled in the art can ascertain a tolerable degree of
mismatch by use of standard procedures to determine the
melting point of the hybridized complex.
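
By way of illustration only, the relationship between
complementarity, length, and duplex stability discussed above
may be sketched in a few lines of Python. Nothing in this
sketch is part of the disclosure: the function names and the
18-nucleotide target are hypothetical, and the melting
temperature is estimated with the Wallace rule (2°C per A/T
pair, 4°C per G/C pair), a rough approximation generally
applied only to short, perfectly matched oligonucleotides. An
experimentally determined melting point remains the standard.

```python
# Illustrative sketch only; names and the example sequence are hypothetical.

# Watson-Crick partner (written as DNA) for each base of the RNA target.
_RNA_TO_ANTISENSE_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(target_rna: str) -> str:
    """Return the fully complementary antisense DNA oligo, written 5'->3'."""
    return "".join(_RNA_TO_ANTISENSE_DNA[base] for base in reversed(target_rna.upper()))


def count_mismatches(oligo_dna: str, target_rna: str) -> int:
    """Count positions where the oligo departs from perfect complementarity."""
    perfect = antisense_dna(target_rna)
    if len(oligo_dna) != len(perfect):
        raise ValueError("oligo and target must have the same length")
    return sum(1 for a, b in zip(oligo_dna.upper(), perfect) if a != b)


def wallace_tm_celsius(oligo_dna: str) -> int:
    """Rough melting temperature of a short, perfectly matched oligo (Wallace rule)."""
    seq = oligo_dna.upper()
    at_pairs = seq.count("A") + seq.count("T")
    gc_pairs = seq.count("G") + seq.count("C")
    return 2 * at_pairs + 4 * gc_pairs


if __name__ == "__main__":
    target = "AUGGCCUUCGAUCGGAAC"  # made-up 18-nt mRNA segment, not a Serrate sequence
    oligo = antisense_dna(target)
    print(oligo, wallace_tm_celsius(oligo), count_mismatches(oligo, target))
```

A mismatch count of this kind only flags departures from
absolute complementarity; whether a given number of mismatches
is tolerable still has to be established by the melting point
determination described above.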

5.11.2. THERAPEUTIC UTILITY OF VERTEBRATE 
SERRATE ANTISENSE NUCLEIC ACIDS 


The vertebrate Serrate antisense nucleic acids can
be used to treat (or prevent) malignancies or other
disorders, of a cell type which has been shown to express
Serrate or Notch. In specific embodiments, the malignancy is
cervical, breast, or colon cancer, or squamous
adenocarcinoma. Malignant, neoplastic, and pre-neoplastic
cells which can be tested for such expression include but are
not limited to those described supra in Sections 5.8.1 and
5.9.1. In a preferred embodiment, a single-stranded DNA
antisense Serrate oligonucleotide is used.

Malignant (particularly, tumor) cell types which
express Serrate or Notch RNA can be identified by various
methods known in the art. Such methods include but are not
limited to hybridization with a Serrate or Notch-specific
nucleic acid (e.g. by Northern hybridization, dot blot
hybridization, in situ hybridization), observing the ability
of RNA from the cell type to be translated in vitro into
Notch or Serrate, immunoassay, etc. In a preferred aspect,
primary tumor tissue from a patient can be assayed for Notch
or Serrate expression prior to treatment, e.g., by
immunocytochemistry or in situ hybridization.

Pharmaceutical compositions of the invention (see
Section 5.12), comprising an effective amount of a vertebrate
Serrate antisense nucleic acid in a pharmaceutically
acceptable carrier, can be administered to a patient having a
malignancy which is of a type that expresses Notch or Serrate
RNA or protein.

The amount of Serrate antisense nucleic acid which
will be effective in the treatment of a particular disorder
or condition will depend on the nature of the disorder or
condition, and can be determined by standard clinical
techniques. Where possible, it is desirable to determine the
antisense cytotoxicity of the tumor type to be treated in
vitro, and then in useful animal model systems prior to
testing and use in humans.

In a specific embodiment, pharmaceutical
compositions comprising vertebrate Serrate antisense nucleic
acids are administered via liposomes, microparticles, or
microcapsules. In various embodiments of the invention, it
may be useful to use such compositions to achieve sustained
release of the Serrate antisense nucleic acids. In a
specific embodiment, it may be desirable to utilize liposomes
targeted via antibodies to specific identifiable tumor
antigens (Leonetti et al., 1990, Proc. Natl. Acad. Sci.
U.S.A. 87:2448-2451; Renneisen et al., 1990, J. Biol. Chem.
265:16337-16342).

5.12. THERAPEUTIC/PROPHYLACTIC
ADMINISTRATION AND COMPOSITIONS

The invention provides methods of treatment (and 
prophylaxis) by administration to a subject of an effective 
amount of a Therapeutic of the invention. In a preferred 
aspect, the Therapeutic is substantially purified. The 
subject is preferably an animal, including but not limited to 
animals such as cows, pigs, chickens, etc., and is preferably 
a mammal, and most preferably human. 

Various delivery systems are known and can be used 
to administer a Therapeutic of the invention, e.g., 
encapsulation in liposomes, microparticles, microcapsules,
expression by recombinant cells, receptor-mediated 
endocytosis (see, e.g., Wu and Wu, 1987, J. Biol. Chem. 
262:4429-4432), construction of a Therapeutic nucleic acid as 
part of a retroviral or other vector, etc. Methods of 
introduction include but are not limited to intradermal, 
intramuscular, intraperitoneal, intravenous, subcutaneous, 
intranasal, epidural, and oral routes. The compounds may be 
administered by any convenient route, for example by infusion 
or bolus injection, by absorption through epithelial or 
mucocutaneous linings (e.g., oral mucosa, rectal and 
intestinal mucosa, etc.) and may be administered together 
with other biologically active agents. Administration can be 
systemic or local. In addition, it may be desirable to 
introduce the pharmaceutical compositions of the invention 
into the central nervous system by any suitable route, 
including intraventricular and intrathecal injection; 
intraventricular injection may be facilitated by an 
intraventricular catheter, for example, attached to a 
reservoir, such as an Ommaya reservoir. Pulmonary 
administration can also be employed, e.g., by use of an 
inhaler or nebulizer, and formulation with an aerosolizing 
agent. 

In a specific embodiment, it may be desirable to
administer the pharmaceutical compositions of the invention
locally to the area in need of treatment; this may be
achieved by, for example, and not by way of limitation, local
infusion during surgery, topical application, e.g., in
conjunction with a wound dressing after surgery, by
injection, by means of a catheter, by means of a suppository,
or by means of an implant, said implant being of a porous,
non-porous, or gelatinous material, including membranes, such
as silastic membranes, or fibers. In one embodiment,
administration can be by direct injection at the site (or
former site) of a malignant tumor or neoplastic or
pre-neoplastic tissue.

In another embodiment, the Therapeutic can be
delivered in a vesicle, in particular a liposome (see Langer,
Science 249:1527-1533 (1990); Treat et al., in Liposomes in
the Therapy of Infectious Disease and Cancer, Lopez-Berestein
and Fidler (eds.), Liss, New York, pp. 353-365 (1989);
Lopez-Berestein, ibid., pp. 317-327; see generally ibid.).

In yet another embodiment, the Therapeutic can be
delivered in a controlled release system. In one embodiment,
a pump may be used (see Langer, supra; Sefton, CRC Crit. Ref.
Biomed. Eng. 14:201 (1987); Buchwald et al., Surgery 88:507
(1980); Saudek et al., N. Engl. J. Med. 321:574 (1989)). In
another embodiment, polymeric materials can be used (see
Medical Applications of Controlled Release, Langer and Wise
(eds.), CRC Press, Boca Raton, Florida (1974); Controlled
Drug Bioavailability, Drug Product Design and Performance,
Smolen and Ball (eds.), Wiley, New York (1984); Langer and
Peppas, J. Macromol. Sci. Rev. Macromol. Chem. 23:61 (1983);
see also Levy et al., Science 228:190 (1985); During et al.,
Ann. Neurol. 25:351 (1989); Howard et al., J. Neurosurg.
71:105 (1989)). In yet another embodiment, a controlled
release system can be placed in proximity of the therapeutic
target, i.e., the brain, thus requiring only a fraction of
the systemic dose (see, e.g., Goodson, in Medical
Applications of Controlled Release, supra, vol. 2, pp.
115-138 (1984)).

Other controlled release systems are discussed in 
the review by Langer (Science 249:1527-1533 (1990)). 




WO 96/27610 



PCT/US96/03172 



In a specific embodiment where the Therapeutic is a
nucleic acid encoding a protein Therapeutic, the nucleic acid
can be administered in vivo to promote expression of its
encoded protein, by constructing it as part of an appropriate
nucleic acid expression vector and administering it so that
it becomes intracellular, e.g., by use of a retroviral vector
(see U.S. Patent No. 4,980,286), or by direct injection, or
by use of microparticle bombardment (e.g., a gene gun;
Biolistic, Dupont), or coating with lipids or cell-surface
receptors or transfecting agents, or by administering it in
linkage to a homeobox-like peptide which is known to enter
the nucleus (see e.g., Joliot et al., 1991, Proc. Natl. Acad.
Sci. USA 88:1864-1868), etc. Alternatively, a nucleic acid
Therapeutic can be introduced intracellularly and
incorporated within host cell DNA for expression, by
homologous recombination.

In specific embodiments directed to treatment or
prevention of particular disorders, preferably the following
forms of administration are used:

Disorder                  Preferred Forms of Administration

Cervical cancer           Topical
Gastrointestinal cancer   Oral; intravenous
Lung cancer               Inhaled; intravenous
Leukemia                  Intravenous; extracorporeal
Metastatic carcinomas     Intravenous; oral
Brain cancer              Targeted; intravenous; intrathecal
Liver cirrhosis           Oral; intravenous
Psoriasis                 Topical
Keloids                   Topical
Baldness                  Topical
Spinal cord injury        Targeted; intravenous; intrathecal
Parkinson's disease       Targeted; intravenous; intrathecal
Motor neuron disease      Targeted; intravenous; intrathecal
Alzheimer's disease       Targeted; intravenous; intrathecal

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The present invention also provides pharmaceutical 
compositions. Such compositions comprise a therapeutically
effective amount of a Therapeutic, and a pharmaceutically
acceptable carrier. In a specific embodiment, the term
"pharmaceutically acceptable" means approved by a regulatory
agency of the Federal or a state government or listed in the
U.S. Pharmacopeia or other generally recognized pharmacopeia
for use in animals, and more particularly in humans. The
term "carrier" refers to a diluent, adjuvant, excipient, or
vehicle with which the therapeutic is administered. Such
pharmaceutical carriers can be sterile liquids, such as water
and oils, including those of petroleum, animal, vegetable or
synthetic origin, such as peanut oil, soybean oil, mineral
oil, sesame oil and the like. Water is a preferred carrier
when the pharmaceutical composition is administered
intravenously. Saline solutions and aqueous dextrose and
glycerol solutions can also be employed as liquid carriers,
particularly for injectable solutions. Suitable
pharmaceutical excipients include starch, glucose, lactose,
sucrose, gelatin, malt, rice, flour, chalk, silica gel,
sodium stearate, glycerol monostearate, talc, sodium
chloride, dried skim milk, glycerol, propylene glycol,
water, ethanol and the like. The composition, if desired,
can also contain minor amounts of wetting or emulsifying
agents, or pH buffering agents. These compositions can take
the form of solutions, suspensions, emulsions, tablets, pills,
capsules, powders, sustained-release formulations and the
like. The composition can be formulated as a suppository,
with traditional binders and carriers such as triglycerides.
Oral formulation can include standard carriers such as
pharmaceutical grades of mannitol, lactose, starch, magnesium
stearate, sodium saccharine, cellulose, magnesium carbonate,
etc. Examples of suitable pharmaceutical carriers are
described in "Remington's Pharmaceutical Sciences" by E.W.
Martin. Such compositions will contain a therapeutically
effective amount of the Therapeutic, preferably in purified
form, together with a suitable amount of carrier so as to
provide the form for proper administration to the patient.
The formulation should suit the mode of administration.

In a preferred embodiment, the composition is
formulated in accordance with routine procedures as a
pharmaceutical composition adapted for intravenous
administration to human beings. Typically, compositions for
intravenous administration are solutions in sterile isotonic
aqueous buffer. Where necessary, the composition may also
include a solubilizing agent and a local anesthetic such as
lignocaine to ease pain at the site of the injection.
Generally, the ingredients are supplied either separately or
mixed together in unit dosage form, for example, as a dry
lyophilized powder or water free concentrate in a
hermetically sealed container such as an ampoule or sachette
indicating the quantity of active agent. Where the
composition is to be administered by infusion, it can be
dispensed with an infusion bottle containing sterile
pharmaceutical grade water or saline. Where the composition
is administered by injection, an ampoule of sterile water for
injection or saline can be provided so that the ingredients
may be mixed prior to administration.

The Therapeutics of the invention can be formulated
as neutral or salt forms. Pharmaceutically acceptable salts
include those formed with free amino groups such as those
derived from hydrochloric, phosphoric, acetic, oxalic,
tartaric acids, etc., and those formed with free carboxyl
groups such as those derived from sodium, potassium,
ammonium, calcium, ferric hydroxides, isopropylamine,
triethylamine, 2-ethylamino ethanol, histidine, procaine,
etc.

The amount of the Therapeutic of the invention
which will be effective in the treatment of a particular
disorder or condition will depend on the nature of the
disorder or condition, and can be determined by standard
clinical techniques. In addition, in vitro assays may
optionally be employed to help identify optimal dosage
ranges. The precise dose to be employed in the formulation
will also depend on the route of administration, and the
seriousness of the disease or disorder, and should be decided
according to the judgment of the practitioner and each
patient's circumstances. However, suitable dosage ranges for
intravenous administration are generally about 20-500
micrograms of active compound per kilogram body weight.
Suitable dosage ranges for intranasal administration are
generally about 0.01 pg/kg body weight to 1 mg/kg body
weight. Effective doses may be extrapolated from
dose-response curves derived from in vitro or animal model
test systems.
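
As a worked illustration of the per-kilogram range stated
above, the following Python sketch scales the intravenous
range (about 20-500 micrograms per kilogram body weight) to a
given body weight. The function name and the 70 kg example
weight are assumptions made purely for illustration; the
calculation is simple arithmetic and does not replace the
judgment of the practitioner referred to above.

```python
# Illustrative arithmetic only; the function name is hypothetical.
from typing import Tuple

# Stated intravenous range: about 20-500 micrograms per kilogram body weight.
IV_RANGE_UG_PER_KG = (20, 500)


def iv_dose_range_ug(body_weight_kg: float) -> Tuple[float, float]:
    """Return the (low, high) total intravenous dose in micrograms."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = IV_RANGE_UG_PER_KG
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    low_ug, high_ug = iv_dose_range_ug(70)  # 70 kg is an assumed example weight
    print(f"{low_ug / 1000:.1f} mg to {high_ug / 1000:.1f} mg")
```

For the assumed 70 kg subject, the stated range corresponds to
a total of 1.4 mg to 35 mg of active compound.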

Suppositories generally contain active ingredient 
in the range of 0.5% to 10% by weight; oral formulations
preferably contain 10% to 95% active ingredient. 

The invention also provides a pharmaceutical pack
or kit comprising one or more containers filled with one or
more of the ingredients of the pharmaceutical compositions of
the invention. Optionally associated with such container(s)
can be a notice in the form prescribed by a governmental
agency regulating the manufacture, use or sale of
pharmaceuticals or biological products, which notice reflects
approval by the agency of manufacture, use or sale for human
administration.

5.13. DIAGNOSTIC UTILITY

Vertebrate Serrate proteins, analogues,
derivatives, and subsequences thereof, vertebrate Serrate
nucleic acids (and sequences complementary thereto), and
anti-vertebrate Serrate antibodies have uses in diagnostics.
Such molecules can be used in assays, such as immunoassays,
to detect, prognose, diagnose, or monitor various conditions,
diseases, and disorders affecting Serrate expression, or
monitor the treatment thereof. In particular, such an
immunoassay is carried out by a method comprising contacting
a sample derived from a patient with an anti-Serrate antibody
under conditions such that immunospecific binding can occur,
and detecting or measuring the amount of any immunospecific
binding by the antibody. [...]
contacting a sample containing nucleic acid with a nucleic
acid probe capable of hybridizing to Serrate DNA or RNA,
under conditions such that hybridization can occur, and
detecting or measuring any resulting hybridization.

Additionally, since Serrate binds to Notch,
vertebrate Serrate or a binding portion thereof can be used
to assay for the presence and/or amounts of Notch in a
sample, e.g., in screening for malignancies which exhibit
increased Notch expression such as colon and cervical
cancers.

6. ISOLATION AND CHARACTERIZATION 
OF A MOUSE SERRATE HOMOLOG 

A mouse Serrate homolog, termed M-Serrate-1, was
isolated as follows:

Mouse Serrate-1 gene

Tissue origin: 10.5-day mouse embryonic RNA
Isolation method:

a) random primed cDNA against above RNA

b) PCR of above cDNA using:

PCR primer 1: CGI(C/T)TTTGC(C/T)TIAA(A/G)(G/C)AITA(C/T)CA
(SEQ ID NO: 9) {encoding RLCCK(H/E)YQ (SEQ ID NO: 10)};

PCR primer 2: TCIATGCAIGTICCICC(A/G)TT (SEQ ID NO: 11)
{encoding NGGTCID (SEQ ID NO: 12)}

Amplification conditions: 50 ng cDNA, 1 µg each primer,
0.2 mM dNTPs, 1.8 U Taq (Perkin-Elmer) in 50 µl of supplied
buffer, 40 cycles of: 94°C/30 sec, 45°C/2 min, 72°C/1 min
extended by 2 sec each cycle.

Yielded a 1.8 kb fragment which was sequenced at both ends
and identified as corresponding to C-Serrate-1.
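
The primer notation used above, in which parenthesized
alternatives such as (A/G) mark degenerate positions and I
denotes inosine, can be unpacked programmatically. The Python
sketch below is offered only as an illustration: the function
names are hypothetical, the primer strings are copied as
printed, and inosine is treated as a single universal position
that does not add to the degeneracy (an assumption of this
sketch rather than a statement from the text).

```python
# Illustrative sketch only; function names are hypothetical.
import re
from itertools import product
from typing import List

# One token is either a parenthesized set of alternatives or a single base (I = inosine).
_TOKEN = re.compile(r"\([ACGT](?:/[ACGT])+\)|[ACGTI]")


def parse_primer(notation: str) -> List[str]:
    """Return one string of allowed bases per primer position."""
    compact = notation.replace(" ", "").upper()
    tokens = _TOKEN.findall(compact)
    if "".join(tokens) != compact:
        raise ValueError("unrecognized primer notation")
    return [token.strip("()").replace("/", "") for token in tokens]


def degeneracy(notation: str) -> int:
    """Number of distinct sequences in the primer mixture (inosine counts once)."""
    total = 1
    for allowed in parse_primer(notation):
        total *= len(allowed)
    return total


def expand(notation: str) -> List[str]:
    """List every individual sequence represented by the degenerate primer."""
    return ["".join(bases) for bases in product(*parse_primer(notation))]


if __name__ == "__main__":
    primer_2 = "TCIATGCAIGTICCICC(A/G)TT"  # SEQ ID NO: 11 as printed
    print(degeneracy(primer_2))
    print(expand(primer_2))
```

Under these assumptions, primer 2 is a mixture of two sequences
differing at a single position, and primer 1 as printed has
five two-fold degenerate positions.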

Partial DNA sequence of M-Serrate-1: 
From 5' end: 

GTCCCGCGTCACTGCCGGGGGACCCTGCAGCTTCGGCTCAGGGTCTACGCCTGTCATCGGG 
GGTAACACCTTCAATCTCAAGGCCAGCCGTGGCAACGACCGTAATCGCATCGTACTGCCTT 
TCAGTTTCACCTGGCCGAGGTCCTACACTTTGCTGGTGGAG (SEQ ID NO: 13)

Protein translation of above:

SRVTAGGPCSFGSGSTPVIGGNTFNLKASRGNDRNRIVLPFSFTWPRSYTLLVE
(SEQ ID NO: 14) (corresponds to amino-terminal sequence
upstream of the DSL domain)

From 3' end (but coding strand):

TCTTCTAACGTCTGTGGTCCCCATGGCAAGTGCAAGAGCCAGTCGGCAGGCAAATTCACCT
GTGACTGTAACAAAGGCTTCACCGGCACCTACTGCCATGAAAATATCAACGACTGCGAGAG
CAACCCCTGTAAA (SEQ ID NO: 15)

Protein translation of above:

SSNVCGPHGKCKSQSAGKFTCDCNKGFTGTYCHENINDCESNPCK (SEQ ID NO: 16)
(within tandemly arranged EGF-like repeats)
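
The correspondence between the partial DNA sequences and the
protein translations given above can be checked with a short
Python sketch using the standard genetic code. The function and
variable names are hypothetical and introduced only for
illustration. Reading the sequences as printed, the 5' fragment
appears to match its translation when read from its second base
(frame offset 1), and the 3' fragment when read from its first
base (frame offset 0).

```python
# Illustrative sketch only; function and variable names are hypothetical.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate coding-strand DNA, starting `frame` bases in (0, 1, or 2)."""
    seq = "".join(dna.split()).upper()[frame:]
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# SEQ ID NO: 13 as printed (5' end of the M-Serrate-1 fragment).
FIVE_PRIME_FRAGMENT = (
    "GTCCCGCGTCACTGCCGGGGGACCCTGCAGCTTCGGCTCAGGGTCTACGCCTGTCATCGGG"
    "GGTAACACCTTCAATCTCAAGGCCAGCCGTGGCAACGACCGTAATCGCATCGTACTGCCTT"
    "TCAGTTTCACCTGGCCGAGGTCCTACACTTTGCTGGTGGAG"
)

# SEQ ID NO: 15 as printed (3' end, coding strand).
THREE_PRIME_FRAGMENT = (
    "TCTTCTAACGTCTGTGGTCCCCATGGCAAGTGCAAGAGCCAGTCGGCAGGCAAATTCACCT"
    "GTGACTGTAACAAAGGCTTCACCGGCACCTACTGCCATGAAAATATCAACGACTGCGAGAG"
    "CAACCCCTGTAAA"
)

if __name__ == "__main__":
    print(translate(FIVE_PRIME_FRAGMENT, frame=1))
    print(translate(THREE_PRIME_FRAGMENT, frame=0))
```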

Expression pattern: The expression pattern was determined to
be the same as that observed for C-Serrate-1 (chicken
Serrate) (see Section 11 infra), including expression in the
developing central nervous system, peripheral nervous system,
limb, kidney, lens, and vascular system.

7. ISOLATION AND CHARACTERIZATION
OF A XENOPUS SERRATE HOMOLOG

A Xenopus Serrate homolog, termed Xenopus Serrate-1,
was isolated as follows:

Xenopus Serrate-1 gene

Tissue origin: neurula-stage embryonic RNA

Isolation method:

a) random primed cDNA against above RNA

b) PCR using:

PCR primer 1: CGI(C/T)TTTGC(C/T)TIAA(A/G)(G/C)AITA(C/T)CA
(SEQ ID NO: 9) {encoding RLCCK(H/E)YQ (SEQ ID NO: 10)};

PCR primer 2: TCIATGCAIGTICCICC(A/G)TT (SEQ ID NO: 11)
{encoding NGGTCID (SEQ ID NO: 12)}

Amplification conditions: 50 ng cDNA, 1 µM each primer,
0.2 mM dNTPs, 1.8 U Taq (Perkin-Elmer) in 50 µl of supplied
buffer. 40 cycles of: 94°C/30 sec, 45°C/2 min, 72°C/1 min
extended by 2 sec each cycle.

Yielded a ~700 bp fragment which was partially sequenced to
confirm its relationship to C-Serrate-1.
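
The cycling programme quoted above can be tallied as follows. This is a minimal Python sketch under two assumptions that the text does not state: the first cycle uses the nominal 1 min extension and every later cycle adds 2 sec to the previous one, and ramp times between temperatures are ignored. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int        # block temperature in degrees Celsius
    seconds: int       # hold time in the first cycle
    extend_s: int = 0  # seconds added in each later cycle (assumed reading of the text)


PROGRAM = (Step(94, 30), Step(45, 120), Step(72, 60, extend_s=2))
CYCLES = 40


def hold_time(step, cycle_index):
    """Hold time of `step` in the zero-based cycle `cycle_index`."""
    return step.seconds + step.extend_s * cycle_index


def total_hold_seconds(program=PROGRAM, cycles=CYCLES):
    """Sum of all programmed holds over all cycles, excluding ramp times."""
    return sum(hold_time(s, c) for c in range(cycles) for s in program)


if __name__ == "__main__":
    total = total_hold_seconds()
    print("extension in last cycle (s):", hold_time(PROGRAM[2], CYCLES - 1))
    print("total programmed hold time (s):", total)
```

Under these assumptions the 72°C step grows from 60 sec to 138 sec over the 40 cycles, and the programmed holds add up to 9960 sec (166 min).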

8. ISOLATION AND CHARACTERIZATION 
OF A CHICK SERRATE HOMOLOG

In the example herein, we report the cloning and
sequence of a chick Serrate homolog, C-Serrate, and of
fragments of two chick Notch homologs, C-Notch-1 and
C-Notch-2, together with their expression patterns during
early embryogenesis. The pattern of transcription of
C-Serrate overlaps with that of C-Notch-1 in many regions of
the embryo, suggesting that C-Notch-1, like Notch in 
Drosophila, is a receptor for Serrate. In particular, Notch 
and Serrate are expressed in the neurogenic regions of the 
developing central and peripheral nervous system. 

Our data show that Serrate, a known ligand of 
Notch, has been conserved from arthropods to chordates. The
overlapping expression patterns suggest conservation of its 
functional relationship with Notch and imply that development 
of the chick and in particular of its central nervous system 
involves the interaction of C-Notch-1 with Serrate at several 
specific locations. 

Materials and Methods 

Embryos 

White Leghorn chicken eggs were obtained from 
University Park Farm and incubated at 38°C. Embryos were 
staged according to Hamburger and Hamilton (1951, J. Exp. 
Zool. 88:49-92). 



Cloning of chicken homologs of Notch 

Approximately 1000 base pair PCR fragments of the 
chicken Notch 1 and Notch 2 genes were amplified from otic 
explant RNA (see below) using degenerate primers and PCR 
conditions as outlined in Lardelli and Lendahl (1993, Exp.
Cell Res. 204:364-372). The PCR fragment was subcloned into
Bluescript KS-, sequenced and used as a template for making a
DIG antisense RNA probe (RNA Transcription Kit, Stratagene;
DIG RNA labelling mix, Boehringer Mannheim).

Cloning of a chicken homologue of Drosophila Serrate 
Otic explants were dissected from embryos of stages
8 to 13. Each otic explant consisted of the two otic cups, a
short section of intervening hindbrain and pharynx and the
associated head ectoderm and mesenchyme. RNA was extracted
using a modification of standard protocols (Sambrook et al.,
1989, in Molecular Cloning: A Laboratory Manual, 2nd ed.,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New
York) and polyA+ mRNA was isolated from total RNA using the
PolyATtract mRNA Isolation System (Promega). First strand
cDNA was synthesized using the SuperScript Preamplification
System (Gibco).

PCR and degenerate primers were used to amplify a
fragment of a chicken gene homologous to the Drosophila gene
Serrate from the otic explant cDNA. The primers were
designed to recognize peptide motifs found in both the fly
Delta and Serrate proteins:

1) primer 1, 5'-CGI(T/C)TITGC(T/C)TIAA(G/A)(G/C)AITA(C/T)CA-3'
(SEQ ID NO:17), corresponds to the motif RLCLK(E/H)YQ
(SEQ ID NO:18) located at the amino-terminus of the fly Delta
and Serrate proteins.

2) primer 2, 5'-TCIATGCAIGTICCICC(A/G)TT-3' (SEQ ID NO:11),
corresponds to the motif NGGTCID (SEQ ID NO:12) found in
several of the EGF-like repeats.

The PCR conditions were as follows: 35 cycles of 94°C for
1 minute, 45°C for 1.5 minutes and 72°C for 2 minutes;
followed by a final extension step of 72°C for 10 minutes.
A PCR product of approximately 900 base pairs in length was
purified, subcloned into Bluescript KS- (Stratagene) and its
DNA sequence partially determined to confirm that it was a
likely Serrate homolog. It was then used to recover larger
cDNA clones by screening two cDNA libraries:

1) a stage 8-13 otic explant random primed cDNA library
2) a stage 17 chick spinal cord oligo dT primed cDNA library

Overlapping cDNAs were isolated, and two (termed 9 and 3A.1)
that together cover almost the entire coding region of the
gene were subcloned into Bluescript KS-. DNA sequence was
determined from nested deletion series generated using the
double-stranded Nested Deletion Kit (Pharmacia) and Sanger
dideoxy chain termination method with the Sequenase enzyme
(US Biochemical Corporation). Sequences were aligned and
analyzed using GeneWorks 2.3 and IntelliGenetics. Homology
searches were done using the program Sharq.
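
The relationship between the two degenerate primers and the peptide motifs they target can be illustrated in code. The Python sketch below parses the primer notation used above, counts the oligonucleotide mixture implied by the (X/Y) positions, and checks that primer 1 read 5' to 3', and primer 2 read as its reverse complement, are compatible with the stated motifs under the standard genetic code. Modelling inosine (I) as tolerating any base, and all function names, are assumptions made for illustration only.

```python
import re
from math import prod

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}  # residue -> list of codons that encode it
for i, b1 in enumerate(BASES):
    for j, b2 in enumerate(BASES):
        for k, b3 in enumerate(BASES):
            CODONS_FOR.setdefault(AMINO_ACIDS[16 * i + 4 * j + k], []).append(b1 + b2 + b3)

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)|([ACGTI])")


def parse_primer(notation):
    """Return, 5' to 3', the set of bases each primer position can stand for."""
    positions = []
    for mixed, single in TOKEN.findall(notation):
        if mixed:
            positions.append(set(mixed.split("/")))
        elif single == "I":
            positions.append(set("ACGT"))  # assumption: inosine tolerates any base
        else:
            positions.append({single})
    return positions


def reverse_complement(positions):
    """Reverse the positions and complement every base option."""
    return [{COMPLEMENT[b] for b in pos} for pos in reversed(positions)]


def mixed_base_degeneracy(notation):
    """Number of distinct oligonucleotides implied by the (X/Y) mixed positions."""
    return prod(len(m.split("/")) for m, _ in TOKEN.findall(notation) if m)


def encodes(positions, motif):
    """True if every residue option in `motif` has a codon the positions allow."""
    for n, options in enumerate(motif):
        window = positions[3 * n:3 * n + 3]  # the last window may be a partial codon
        for residue in options:
            if not any(all(b in p for b, p in zip(codon, window))
                       for codon in CODONS_FOR[residue]):
                return False
    return True


PRIMER_1 = "CGI(T/C)TITGC(T/C)TIAA(G/A)(G/C)AITA(C/T)CA"  # SEQ ID NO:17
PRIMER_2 = "TCIATGCAIGTICCICC(A/G)TT"                      # SEQ ID NO:11
MOTIF_1 = ["R", "L", "C", "L", "K", "EH", "Y", "Q"]         # RLCLK(E/H)YQ
MOTIF_2 = list("NGGTCID")                                   # NGGTCID

if __name__ == "__main__":
    print("primer 1 fits motif:", encodes(parse_primer(PRIMER_1), MOTIF_1))
    print("primer 2 (reverse complement) fits motif:",
          encodes(reverse_complement(parse_primer(PRIMER_2)), MOTIF_2))
```

Primer 2 only matches its motif after reverse complementation, which is consistent with its use as the antisense primer of the pair.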

To obtain the most 5' end of the open reading
frame, a number of other PCR based strategies were used,
including the screening of a number of other libraries (cDNA
and genomic) using the method of Lardelli et al. (1994,
Mechanisms of Development 46:123-136).

In situ hybridization 

Patterns of gene transcription were determined by
in situ hybridization using DIG-labeled RNA probes and:

1) a high-stringency wholemount in situ hybridization
protocol, and

2) in situ hybridization on cryostat sections based on the
protocol of Strähle et al. (1994, Trends in Genet. 10:7).

Results

To obtain insight into the likely role of chick 
Serrate in the vertebrate embryo, we examined its expression 
in relation to that of chick Notch, since functional coupling 
of Notch and Serrate occurs in Drosophila. Two chick Notch
homologs were obtained as described below.

C-Notch-1 and C-Notch-2 are apparent counterparts of the
rodent Notch-1 and Notch-2 genes, respectively 

We searched for Notch homologs in the chick by PCR,
using cDNA prepared from two-day chick embryos and degenerate
primers based on conserved regions common to the known rodent
Notch homologs. In this way, we obtained fragments, each
approximately 1000 nucleotides long, of two distinct genes,
which we have called C-Notch-1 and C-Notch-2. The fragments
extend from the third Notch/lin12 repeat up to and including
the last five or so EGF-like repeats. EGF-like repeats are
present in a large number of proteins, most of which are
otherwise unrelated to Notch. The three Notch/lin12 repeats,
however, are peculiar to the Notch family of genes and are
found in all its known members. C-Notch-1 shows the highest
degree of amino-acid identity with rodent Notch1 (Weinmaster
et al., 1991, Development 113:199-205), and is expressed in
broadly similar domains to rodent Notch1 (see below). Of the
rodent Notch genes, C-Notch-2 appears most similar to Notch2
(Weinmaster et al., 1992, Development 116:931-941).

We examined the expression patterns of C-Notch-1 in
early embryos by in situ hybridization. C-Notch-1 was
expressed in the 1- to 2-day chick embryo in many
well-defined domains, including the neural tube, the presomitic
mesoderm, the nephrogenic mesoderm (the prospective
mesonephros), the nasal placode, the otic placode/vesicle,
the lens placode, the epibranchial placodes, the endothelial
lining of the vascular system, the heart, and the apical
ectodermal ridges (AER) of the limb buds. These sites match
the reported sites of Notch1 expression in rodents at
equivalent stages (Table II). Taking the sequence data
together with the expression data, we conclude that C-Notch-1
is either the chick ortholog of rodent Notch1, or a very
close relative of it.



Table II 

COMPARISON OF DOMAINS OF RODENT NOTCH1
AND C-NOTCH-1 EXPRESSION THROUGHOUT EMBRYOGENESIS

Body Region               R-Notch1(a)    C-Notch1

primitive streak          +              +
Hensen's node             +              +
neural tube
retina                                   +
lens                                     +
otic placode/vesicle      +              +
epibranchial placodes
nasal placode             +              +
dorsal root ganglia       +
presomitic mesoderm       +              +
somites                                  +
notochord                 ?              +
mesonephric kidney                       +
metanephric kidney        +              +
blood vessels             +              +
heart                                    +
                                         N/A
thymus                    +
toothbuds                                N/A
salivary gland
limb bud (AER)                           +

(a) From Weinmaster et al., 1991, Development 113:199-205;
Franco del Amo et al., 1992, Development 115:737-744;
Reaume et al., 1992, Dev. Biol. 154:377-387; Kopan and
Weintraub, 1993, J. Cell. Biol. 121:631-641; Lardelli et
al., 1994, Mech. of Dev. 46:123-136.



C-Serrate is a homolog of Drosophila Serrate, and codes for a
candidate ligand for a receptor belonging to the Notch family 

In Drosophila, two ligands for Notch are known,
encoded by the two related genes Delta and Serrate. The
amino-acid sequences corresponding to these genes are
homologous at their 5' ends, including a region, the DSL
motif, which is necessary and sufficient for in vitro binding
to Notch. To isolate a fragment of a chicken homolog of
Serrate, we used PCR and degenerate primers designed to
recognize sequences on either side of the DSL motif (see
Materials and Methods). A 900 base pair PCR fragment was
recovered and used to screen a library, allowing us to
isolate overlapping cDNA clones. The DNA sequence of the
cDNA clones revealed an almost complete single open reading
frame of 3582 nucleotides, lacking only a few 5' bases.
Comparison with the amino acid sequences of Drosophila Delta
and Serrate suggests that we are missing only the portion of
the coding sequence that encodes part of the signal sequence
of the chick Serrate protein.

Translation of the nucleotide sequence
(SEQ ID NO:5) (Fig. 3) predicts a protein of 1230 amino acids
(SEQ ID NO:6) (Fig. 4). A hydropathy plot reveals a single
hydrophobic region characteristic of a transmembrane domain
(Kyte and Doolittle, 1982, J. Mol. Biol. 157:105-132). In
addition, the protein has sixteen EGF-like repeats organized
in a tandem array in its extracellular domain. Comparison of
the chick sequence with sequences of D. melanogaster Delta
and Serrate suggests that the clones encode a chicken homolog
of Serrate (Fig. 5; Fig. 6). Whereas Drosophila Serrate
contains 14 EGF-like repeats with large insertions in repeats
4, 6 and 10, the chicken homolog has an extra two EGF-like
repeats and only one small insertion of 16 amino acids in the
10th repeat. Both proteins have a second cysteine-rich
region between the EGF-like repeats and the transmembrane
domain; the spacing of the cysteines in this region is almost
identical in the two proteins (compare
CX2CXCX6CX4CX,,CX5CX7CX4CX5C in Drosophila Serrate with
CX2CXCX6CX4CX9CX5CX7CX4CX3C in C-Serrate). The intracellular
domain of C-Serrate bears no significant homology to the
intracellular domains of either Drosophila Delta or Serrate.
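
The cysteine-spacing notation used for the second cysteine-rich region can be turned into a searchable pattern. The Python sketch below does this for the C-Serrate spacing quoted above (CX2CXCX6CX4CX9CX5CX7CX4CX3C). Reading X as "any residue other than cysteine" is an assumption, as are the function names; the test sequence is a synthetic placeholder and not part of any sequence in this disclosure.

```python
import re

C_SERRATE_SPACING = "CX2CXCX6CX4CX9CX5CX7CX4CX3C"  # as quoted for C-Serrate


def gaps(spacing):
    """Residues between successive cysteines, e.g. 'CX2CXC' gives [2, 1]."""
    return [int(n or 1) for n in re.findall(r"X(\d*)", spacing)]


def to_regex(spacing):
    """Regular expression for the spacing; X is taken as any non-cysteine residue."""
    return "C" + "".join("[^C]{%d}C" % n for n in gaps(spacing))


def toy_sequence(spacing, filler="A"):
    """Synthetic sequence with the requested spacing, for testing only."""
    return "C" + "".join(filler * n + "C" for n in gaps(spacing))


if __name__ == "__main__":
    pattern = re.compile(to_regex(C_SERRATE_SPACING))
    example = toy_sequence(C_SERRATE_SPACING)
    print(gaps(C_SERRATE_SPACING))
    print(bool(pattern.fullmatch(example)), len(example))
```

The region described by this spacing spans 51 residues: ten cysteines and 41 intervening positions.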

C-Serrate is expressed in the central nervous system, cranial 
placodes, nephric mesoderm, vascular system, and limb bud 
mesenchyme 

In situ hybridization was performed to examine the
expression of C-Serrate in whole-mount preparations during
early embryogenesis, from stage 4 to stage 21, at intervals
of roughly 12 hours. Later stages were studied by in situ
hybridization on cryosections.

The main sites of early expression of C-Serrate, as
seen in whole mounts, can be grouped under five headings:
central nervous system, cranial placodes, nephric mesoderm,
vascular system, and limb bud mesenchyme.

Central nervous system 

The first detectable expression of C-Serrate was
seen in the central nervous system at stage 6 (0 somites/24
hrs), within the posterior portion of the neural plate. By
stage 10 (9-11 somites/35.5 hrs), a strong stripe of
expression was seen in the prospective diencephalon.
Additional faint staining was seen in the hindbrain and in
the prospective spinal cord.

At stage 13, there were several patches of
expression in the neural tube. In the diencephalon, there
was a strong triangular stripe of expression that appeared to
correspond to neuromere D2. There were two patches (one on
either side of the midline) on the floor of the anterior
mesencephalon as well as diffuse staining in the dorsal
mesencephalon. In the hindbrain and rostral spinal cord,
there were two longitudinal stripes of expression on either
side of the midline: one along the dorsal edge of the neural
tube and a second more ventral one, adjacent to the floor
plate. Both were located within the domain of (rat) Notch 1
expression. The anterior limit of the ventral stripe was at
the midbrain/hindbrain boundary. The dorsal stripe was
continuous with the expression in the dorsal mesencephalon.
In the anterior spinal cord, expression was more spotty, the
stripes being replaced by isolated scattered cells expressing
C-Serrate.

At stage 17 (58 hrs), expression in the
diencephalon and midbrain was unchanged. In the hindbrain
and spinal cord, there were an additional two longitudinal
stripes: one midway along the dorsoventral axis and a second
wider more ventral stripe; the anterior limits of these
stripes coincided with the anterior border of rhombomere 2.
All four longitudinal stripes in the hindbrain continued into
the spinal cord of the embryo, decreasing towards its
posterior end. These stripes of expression were maintained
at least up to and including stage 31 (E7). By stage 21 (84
hrs), additional expression was seen in the cerebral
hemispheres and strong expression in a salt and pepper
distribution of cells in the optic tectum.

Cranial placodes

It is striking that C-Serrate is expressed in all
the cranial placodes - the lens placode, the nasal placode,
the otic placode/vesicle and the epibranchial placodes, as
well as a patch of cranial ectoderm anterior to the otic
placode that may correspond to the trigeminal placode (which
is not well-defined morphologically).

In the lens placode, expression was already seen at
stage 11, rapidly became very strong, and persisted at least
to stage 21. Expression was weaker in the nasal placode and
was only detected from stage 13. Again, expression was
maintained at least until stage 21.

Likewise for the otic placode, expression began to
be visible at stage 10 and was strong by early stage 11
(12-14 somites, 42.5 hours). Curiously, there was a "hole" in
the otic expression domain - an anteroventral region of the
placode in which the gene was not expressed. Subsequently,
as the placode invaginates to form an otic vesicle, the
strongest expression was seen at the anterolateral and
posteromedial poles. Later still, as the otic vesicle
becomes transformed into the membranous labyrinth of the
inner ear, C-Serrate expression became restricted to the
sensory patches.

The epibranchial expression was seen at stage 13/14
as strong staining in the ectoderm around the dorsal margins
of the first and second branchial clefts. It was accompanied
by expression of the gene in the deep part of the lining of
the clefts and in the endodermal lining of the branchial
pouches, where the two epithelia abut one another.

Lastly, a large and strong but transient patch of
expression was seen in the cranial ectoderm just anterior and
ventral to the ear rudiment at stage 11. From its location,
we suspect this to be, or to include, the region of the
trigeminal placode.

Nephric mesoderm 
Expression was detectable in the cells of the
intermediate mesoderm from stage 10 and in older embryos
(stage 17 to 21) in the developing mesonephric tubules.

Limb buds 

C-Serrate mRNA was localized to a patch of mesenchyme at the
distal end of the developing limb bud. This may suggest a
role in limb growth.

Other sites 

Expression was also seen in the tail bud, allantoic stalk,
and possibly other tissues at late stages.

All major sites of C-Serrate expression lie within domains of
C-Notch-1 expression 

The conservation of the DSL domain and adjacent
N-terminal region in C-Serrate suggests that it functions as a
ligand for a receptor belonging to the Notch family. We thus
expected to find sites where C-Serrate expression is
accompanied by expression of a Notch gene. At such sites,
overlapping or contiguous expression of the two genes can be
taken as an indication that cells are communicating by
Serrate-Notch signalling. We have compared the expression
pattern of C-Serrate, as shown by in situ hybridization, with
that of C-Notch-1, to discover what overlaps in fact occur,
over a range of stages up to 8 days of incubation (E8). All
the observed sites of C-Serrate expression indeed lay within,
or very closely adjacent to, domains of expression of
C-Notch-1 (Table III).



Table III 



COMPARISON OF C-NOTCH-1 AND
C-SERRATE EXPRESSION AT STAGE 12(a)

Body region

brain and spinal cord
retina
lens
otic placode/vesicle
epibranchial placodes
nasal placode
dorsal root ganglia
branchial mesenchyme
branchial ectoderm
branchial endoderm
presomitic mesoderm
somites
notochord
mesonephric kidney
metanephric kidney
blood vessels
heart
limb bud (stage 21)

C-Notch-1
++ (almost everywhere)

C-Serrate
(specific regions)

+
++
++
++
+

++
++

(furrows)
++ (tips of pouches)

++ (AER)

(distal mesenchyme)

(a) Hamburger and Hamilton, 1951, J. Exp. Zool. 88:49-92




Because of the importance of Notch and its partners
in insect neurogenesis, it was of particular interest to us
to see whether the homologous genes are involved in the
development of the vertebrate CNS. C-Serrate is expressed in
the CNS, and its pattern of expression shows a
relationship to that of the Notch homologs.

We analyzed transverse sections through the spinal
cord of a six day chicken embryo hybridized with C-Notch-1
and C-Serrate antisense RNA probes. C-Notch-1 was expressed
throughout the luminal region as described previously; within
this region, there were two small patches in which Serrate
was strongly expressed.

Discussion 

In Drosophila development, cell-cell signalling via
the product of the Notch gene plays a cardinal role in the
final cell-fate decisions that specify the detailed pattern
of differentiated cell types. This signalling pathway, in
which the Notch protein has been identified as a
transmembrane receptor, is best known for its role in
neurogenesis: loss-of-function mutations in Notch or any of a
set of other genes required for signal transmission via Notch
alter cell fates in the neuroectoderm, causing cells that
should have remained epidermal to become neural instead.
Notch-dependent signalling is, however, as important in
non-neural as in neural tissues. It regulates choices of mode of
differentiation in oogenesis, in myogenesis, in formation of
the Malpighian tubules and in the gut, for example, as well
as in development of the retina, the peripheral sensilla, and
the central nervous system. In most of these cases the
signal delivered via Notch appears to mediate lateral
inhibition, a type of interaction by which a cell that
becomes committed to differentiate in a particular way - for
example, as a neuroblast - inhibits its immediate neighbors
from doing likewise. This forces adjacent cells to behave in
contrasting ways, creating a fine-grained pattern of
different cell types.

There are, however, good reasons to believe that
this is not the only function of signals delivered via Notch.
Two direct ligands of Notch have been identified. These are
the products of the Delta and Serrate genes. Both of them,
like Notch itself, code for transmembrane proteins with
tandem arrays of EGF-like repeats in their extracellular
domain. Both the Delta and the Serrate protein have been
shown to bind to Notch in a cell adhesion assay, and they
share a large region of homology at their amino-termini
including a motif that is necessary and sufficient for
interaction with Notch in vitro, the so-called EBD or DSL
domain. Yet despite these biochemical similarities, they
seem to have quite different developmental functions.

Although Serrate is expressed in many sites in the fly, it is
apparently required only in the humeral, wing and haltere
discs. When Serrate function is lost by mutation, these
structures fail to grow. Studies on the wing disc have
indicated that it is specifically the wing margin that
depends on Serrate; when Serrate is lacking, this critical
signaling region and growth centre fails to form, and when
Serrate is expressed ectopically under a GAL4-UAS promoter in
the ventral part of the wing disc, ectopic wing margin tissue
is induced, leading to ectopic outgrowths. Notch appears to
be the receptor for Serrate at the wing margin, since some
mutant alleles of Notch cause similar disturbances of wing
margin development and allele-specific interactions are seen
in the effects of the two genes.

Here we describe the identification and full length
sequence of a homolog of the Drosophila gene Serrate, and
identification and partial sequence of chick homologs of
rat/mouse Notch1 and Notch2.

Within the chick Serrate cDNA there is a single
open reading frame predicted to encode a large transmembrane
protein with 16 EGF repeats in its extracellular domain. It
has a well conserved DSL motif suggesting that it would
interact directly with Notch. The intracellular domain of
chick Serrate exhibits no homology to anything in the current
databases including the intracellular domains of Drosophila
Delta and Serrate. It should be pointed out however that the
intracellular domains of chick and human Serrate (see Section
12) are almost identical.

The spatial distributions of C-Notch-1 and
C-Serrate were investigated during early embryogenesis by in
situ hybridization. C-Notch-1 and C-Serrate exhibit dynamic
and complex patterns of expression including several regions
in which they are coexpressed (CNS, ear, branchial region,
lens, heart, nasal placodes and mesonephros). The
overlapping expression together with the finding that
C-Serrate has a well conserved Notch binding domain suggests
that this receptor/ligand interaction has been conserved from
Drosophila through to vertebrates.

In Drosophila, the Notch receptor is quite widely
distributed and its ligands are found in overlapping but more
restricted domains. In the chick a similar situation is
observed.

Fly Notch is necessary for many steps in the
development of Drosophila; its role in lateral inhibition,
especially in the development of the central nervous system
and peripheral sense organs, being the best studied examples.
However, Notch is a multifunctional receptor and can interact
with different signalling molecules (including Delta and
Serrate) and in developmental processes that do not easily
fit within the framework of lateral inhibition. While
available evidence implicates Delta as the signalling
molecule in lateral inhibition, there are no data to suggest
that Serrate participates in lateral inhibition. Rather,
Serrate appears to be necessary for development of the dorsal
imaginal discs of the larva; that is, the humeral, haltere
and wing discs. In the latter, the best studied of these
processes, Serrate and Notch are important for the
development of the dorsoventral wing margin, a structure
necessary for the organization of wing development as a
whole.

That C-Serrate has a significant function can be
inferred from the conservation of its sequence, in
particular, of its Notch-binding domain. The expression
patterns reported for C-Serrate in this paper provide the
following information. First, since the Serrate gene is
expressed in or next to sites where C-Notch-1 is expressed
(possibly in conjunction with other Notch homologs), it is
highly probable that C-Serrate exerts its action by binding
to C-Notch-1 (or to another chick Notch homolog with a
similar expression pattern). Second, the expression in the
developing kidney, the vascular system and the limb buds
might reflect an involvement in inductive signalling between
mesoderm and ectoderm, which plays an important part in the
development of all these organs. In the limb buds, for
example, C-Serrate is expressed in the distal mesoderm, and
C-Notch-1 is expressed in the overlying apical ectodermal
ridge, whose maintenance is known to depend on a signal from
the mesoderm below. In the cranial placodes, a similar role
is possible, but the evidence for inductive signalling is
weaker, and C-Serrate may equally be involved in
communications between cells within the placodal epithelium,
for example, in regulating the specialized modes of
differentiation of the placodal cells.

What might C-Serrate's function be within the
curiously restricted domains of its expression in the CNS?
One possibility is that it is involved in regulating the
production of oligodendrocytes, which have likewise been
reported to originate from narrow bands of tissue extending
along the cranio-caudal axis of the neural tube.

9. ISOLATION AND CHARACTERIZATION
OF HUMAN SERRATE HOMOLOGS

Clones for the human Serrate sequence were obtained
as described below.

The polymerase chain reaction (PCR) was used to
amplify DNA from a human placenta cDNA library. Degenerate
oligonucleotide primers used in this reaction were designed
based on amino-terminal regions of high homology between
Drosophila Serrate and Drosophila Delta (see Fig. 5); this
high homology region includes the 5' "DSL" domain, that is
believed to code for the Notch-binding portion of
Serrate. Two PCR products were isolated and used, one a 350
bp fragment, and one a 1.2 kb fragment. These PCR fragments
were labeled with 32P and used to screen a commercial human
fetal brain cDNA library made from a 17-18 week old fetus
(previously available from Stratagene), in which the cDNAs
were inserted into the EcoRI site of a λ-Zap vector.

The 1.2 kb fragment hybridized to a single clone
out of the 10* clones screened. We rescued this fragment from
the λ DNA by converting the isolated phage λ clone to a
plasmid via the manufacturer's instructions, yielding the
Serrate-homologous cDNA as an insert in the EcoRI site of the
vector Bluescript KS- (Stratagene). This plasmid was named
"pBS39" and the gene corresponding to this cDNA clone was
called Human Serrate-1 (also known as Human Jagged-1
("HJ1")). The isolated cDNA was 6464 nucleotides long and
contained a complete open reading frame as well as 5' and 3'
untranslated regions (Fig. 1). Sequencing was carried out
using the Sequenase® sequencing system (U.S. Biochemical
Corp.) on 5 and 6% Sequagel acrylamide sequencing gels.

The 350 bp fragment hybridized with two clones,
containing cDNA inserts of approximately 1.1 and 3.1 kb in
length; the plasmid constructs containing these inserts were
named pBS14 and pBS15, respectively. Each clone was
isolated, its respective insert rescued from the λ DNA, and
sequenced as above. The nucleotide sequence of the pBS14
insert was identical to a 1.1 kb stretch of sequence
contained internally within the pBS15 cDNA insert and
therefore, this clone was not characterized further. The
sequence of the 3.1 kb pBS15 insert encoded a single open
reading frame which spanned all but the 5' 20 nucleotides of
the insert. The methionine located at the amino terminal
residue of this predicted open reading frame was homologous
to the start methionine encoded by the Human Serrate-1 (HJ1)
cDNA clone in pBS39. The gene encoding the cDNA insert of
pBS15 was named Human Serrate-2 and is also known as Human
Jagged-2 ("HJ2").

The pBS15 (HJ2) 3.1 kb insert was then labeled with
32P and used to screen another human fetal brain library (from
Clontech), in which cDNA generated from a 25-26 week-old
fetus was cloned into the EcoRI site of λgt11. This screen
identified three potential positive clones. To isolate the
cDNAs, λgt11 DNA was prepared from a liquid lysate and
purified over a DEAE column. The purified DNA was then cut
with EcoRI and the cDNA inserts were isolated and subcloned
into the EcoRI site of Bluescript KS-. The Bluescript
constructs containing these cDNAs were named pBS3-15, pBS3-2,
and pBS3-20. Two of these cDNA clones, pBS3-2 and pBS3-20,
contained sequences that partially overlapped with pBS15 and
were further characterized. pBS3-2 had a 3.2 kb insert
extending from nucleotide 1210 of the pBS15 cDNA insert to
just after the polyadenylation signal. The 2.6 kb insert of
pBS3-20 was restriction mapped and partially sequenced to
determine its 3' and 5' ends. This analysis indicated that
the pBS3-20 insert had a nucleic acid sequence that was fully
contained within the pBS3-2 cDNA insert and therefore, the
pBS3-20 insert was not characterized further. The insert of
pBS3-15 was determined to be a Bluescript vector fragment
contaminant.

Alignment of the deduced amino acid sequence
(SEQ ID NO:4) of the "complete" Human Serrate-2 (HJ2) cDNA
(SEQ ID NO:3) generated on the computer with the deduced
amino acid sequence of Human Serrate-1 (HJ1) from pBS39
(SEQ ID NO:2) revealed a gap of about 120 bases, leading to a
frameshift, in the region encoded by the pBS15 (HJ2) insert,
between the putative signal sequence and the beginning of the
DSL domain (Fig. 2). The nucleotides missing in the gap of
the pBS15 insert would be located between nucleotides 240 and
241 of SEQ ID NO:3. This missing region probably resulted
from a cloning artifact in the construction of the Stratagene
library.

Attempts to clone the 5' end of HJ2 using anchored
PCR, RACE, and Takara extended PCR techniques were
unsuccessful. However, three human genomic clones
potentially containing the 5' end of HJ2 were obtained from
the screening of a human genomic cosmid library in which 30
kb fragments were cloned into a unique XhoI site introduced
into the BamHI site of a pWE15 vector (the unmodified vector
is available from Stratagene). This cosmid library was

WO 96/27610                PCT/US96/03172

screened with a PCR fragment that had been amplified from the
5' end of pBS15 (HJ2) and three positive cosmid clones were
isolated. Two different sets of primers were used to amplify
DNA corresponding to the 5' end of pBS15 using the cosmid
clones as a template, and both sets generated single bands
that were subcloned, but which were determined to contain PCR
artifacts. Portions of the cosmid clones are being subcloned
directly without PCR, in order to obtain a portion of the
cosmid clones that contains the 120 nucleotide stretch of DNA
that is missing from pBS15.

The pBS39 cDNA insert, encoding the Human Serrate-1
homolog (HJ1), has been sequenced and contains the complete
coding sequence for the gene product. The nucleotide
(SEQ ID NO:1) and protein (SEQ ID NO:2) sequences are shown
in Figure 1. The nucleotide sequence of Human Serrate-1
(HJ1) was translated using MacVector software (International
Biotechnology Inc., New Haven, CT). The coding region
consists of nucleotide numbers 371-4024 of SEQ ID NO:1. The
Protean protein analysis software program from DNAStar
(Madison, WI) was used to predict signal peptide and
transmembrane regions (based on hydrophobicity). The signal
peptide was predicted to consist of amino acids 14-29 of
SEQ ID NO:2 (encoded by nucleotide numbers 410-457 of
SEQ ID NO:1), whereby the amino terminus of the mature
protein was predicted to start with Gly at amino acid number
30. The transmembrane domain was predicted to be amino acid
numbers 1068-1089 of SEQ ID NO:2, encoded by nucleotide
numbers 3572-3637 of SEQ ID NO:1. The consensus (DSL)
domain, the region of homology with Drosophila Delta and
Serrate, predicted to mediate binding with Notch (in
particular, Notch ELR 11 and 12), spans amino acids 185-229
of SEQ ID NO:2, encoded by nucleotide numbers 923-1057 of
SEQ ID NO:1. Epidermal growth factor-like (ELR) repeats in
the amino acid sequence were identified by eye; 15
(full-length) ELRs were identified and 3 partial ELRs as follows:
ELR 1: amino acid numbers 234 - 264
ELR 2: amino acid numbers 265 - 299
ELR 3: amino acid numbers 300 - 339
ELR 4: amino acid numbers 340 - 377
ELR 5: amino acid numbers 378 - 415
ELR 6: amino acid numbers 416 - 453
ELR 7: amino acid numbers 454 - 490
ELR 8: amino acid numbers 491 - 528
ELR 9: amino acid numbers 529 - 566
Partial ELR: amino acid numbers 567 - 598
Partial ELR: amino acid numbers 599 - 632
ELR 10: amino acid numbers 633 - 670
ELR 11: amino acid numbers 671 - 708
ELR 12: amino acid numbers 709 - 747
ELR 13: amino acid numbers 748 - 785
ELR 14: amino acid numbers 786 - 823
ELR 15: amino acid numbers 824 - 862
Partial ELR: amino acid numbers 863 - 879
Partial ELR: amino acid numbers 880 - 896
The total ELR domain is thus amino acid numbers 234 - 896
(encoded by nucleotide numbers 1070 - 3058 of SEQ ID NO:1).
The extracellular domain is thus predicted to be amino acid
numbers 1 - 1067 of SEQ ID NO:2, encoded by nucleotide
numbers 371 - 3571 of SEQ ID NO:1 (amino acid numbers
30 - 1067 in the mature protein; encoded by nucleotide
numbers 458 - 3571 of SEQ ID NO:1). The intracellular
(cytoplasmic) domain is thus predicted to be amino acid
numbers 1090 - 1218 of SEQ ID NO:2, encoded by nucleotide
numbers 3638 - 4024 of SEQ ID NO:1.
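The hydrophobicity-based prediction described above can be illustrated with a minimal sketch. It assumes the Kyte-Doolittle hydropathy scale, which is a stand-in chosen for illustration and is not stated to be the method used by the Protean program. The input is amino acids 1057-1100 of SEQ ID NO:2 in one-letter code; the names KD, SEGMENT and mean_hydropathy are hypothetical.

```python
# Kyte-Doolittle hydropathy values. The choice of this scale is an assumption
# made for illustration; it is not taken from the Protean program.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

# Amino acids 1057-1100 of SEQ ID NO:2, written in one-letter code.
SEGMENT_START = 1057
SEGMENT = "VQRRPLKNRTDFLVPLLSSVLTVAWICCLVTAFYWCLRKRRKPG"


def mean_hydropathy(first_aa, last_aa):
    """Mean hydropathy over an inclusive, 1-based residue range of SEQ ID NO:2."""
    residues = SEGMENT[first_aa - SEGMENT_START:last_aa - SEGMENT_START + 1]
    return sum(KD[aa] for aa in residues) / len(residues)


tm = mean_hydropathy(1068, 1089)          # predicted transmembrane domain
upstream = mean_hydropathy(1057, 1067)    # residues immediately before it
downstream = mean_hydropathy(1090, 1100)  # residues immediately after it
print(f"upstream {upstream:.2f}, transmembrane {tm:.2f}, downstream {downstream:.2f}")
```

On this scale the predicted transmembrane stretch scores as strongly hydrophobic, while the flanking residues, which are rich in Arg and Lys, score as hydrophilic.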

The expression of HJ1 in certain human tissues was
established by probing a Clontech Human Multiple Tissue
Northern blot with radio-labeled pBS39. The probe hybridized
to a single band of about 6.6 kb, and HJ1 was expressed in all
of the tissues assayed, which included heart, brain, placenta,
lung, skeletal muscle, pancreas, liver and kidney. The
observation that HJ1 was expressed in adult skeletal and
heart muscle was particularly interesting, because adult
muscle fibers are completely surrounded by a lamina of
extracellular matrix, and it is unlikely, therefore, that the
role of HJ1 in these cells is in direct cell-cell
communication.

The "complete" (containing an internal deletion)
Human Serrate-2 (HJ2) cDNA nucleotide sequence (SEQ ID NO:3)
and amino acid sequence (SEQ ID NO:4) generated on the
computer are shown in Figure 2. The nucleotide sequence was
translated using MacVector software (International
Biotechnology Inc., New Haven, CT). The coding region
consists of nucleotide numbers 332 - 4102 of SEQ ID NO:3.
The Protean protein analysis software program from DNAStar
(Madison, WI) was used to predict signal peptide and
transmembrane regions (based on hydrophobicity). The
transmembrane domain was predicted to be amino acid numbers
912-933 of SEQ ID NO:4, encoded by nucleotide numbers
3065-3130 of SEQ ID NO:3. The consensus (DSL) domain, the
region of homology with Drosophila Delta and Serrate,
predicted to mediate binding with Notch (in particular, Notch
ELR 11 and 12), spans amino acids 26-70 of SEQ ID NO:4,
encoded by nucleotide numbers 407 - 541 of SEQ ID NO:3.

Epidermal growth factor-like (ELR) repeats in the amino acid
sequence were identified by eye; 15 (full-length) ELRs were
identified and 3 partial ELRs as follows:
ELR 1: amino acid numbers 75 - 105
ELR 2: amino acid numbers 106 - 140
ELR 3: amino acid numbers 141 - 180
ELR 4: amino acid numbers 181 - 218
ELR 5: amino acid numbers 219 - 256
ELR 6: amino acid numbers 257 - 294
ELR 7: amino acid numbers 295 - 331
ELR 8: amino acid numbers 332 - 369
ELR 9: amino acid numbers 370 - 407
Partial ELR: amino acid numbers 408 - 435
Partial ELR: amino acid numbers 436 - 469
ELR 10: amino acid numbers 470 - 507
ELR 11: amino acid numbers 508 - 545
ELR 12: amino acid numbers 546 - 584
ELR 13: amino acid numbers 585 - 622
ELR 14: amino acid numbers 623 - 660
ELR 15: amino acid numbers 664 - 701
Partial ELR: amino acid numbers 702 - 718
Partial ELR: amino acid numbers 719 - 735
The total ELR domain is thus amino acid numbers 75 - 735
(encoded by nucleotide numbers 554 - 2536 of SEQ ID NO:3).
The extracellular domain is thus predicted to be amino acid
numbers 1 - 912 of SEQ ID NO:4, encoded by nucleotide numbers
332 - 3064 of SEQ ID NO:3. The intracellular (cytoplasmic)
domain is thus predicted to be amino acid numbers 934 - 1257
of SEQ ID NO:4, encoded by nucleotide numbers 3131 - 4102 of
SEQ ID NO:3.
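The correspondence between amino acid numbers and nucleotide numbers used above follows from the position of the initiator codon. A minimal sketch of that arithmetic is given here for the HJ2 coordinates; the function name aa_range_to_nt and the dictionary layout are hypothetical, and the only values taken from the text are the start of the coding region (nucleotide 332 of SEQ ID NO:3) and the amino acid ranges.

```python
def aa_range_to_nt(first_aa, last_aa, cds_start):
    """Map an inclusive, 1-based amino acid range to 1-based nucleotide numbers.

    cds_start is the nucleotide number of the first base of codon 1.
    """
    start = cds_start + 3 * (first_aa - 1)  # first base of the first codon
    end = cds_start + 3 * last_aa - 1       # third base of the last codon
    return start, end


HJ2_CDS_START = 332  # first nucleotide of the coding region in SEQ ID NO:3

# Amino acid ranges of SEQ ID NO:4 as stated in the text.
hj2_domains = {
    "DSL domain": (26, 70),
    "total ELR domain": (75, 735),
    "transmembrane domain": (912, 933),
    "intracellular domain": (934, 1257),
}

for name, (first, last) in hj2_domains.items():
    nt_start, nt_end = aa_range_to_nt(first, last, HJ2_CDS_START)
    print(f"{name}: amino acids {first}-{last} -> nucleotides {nt_start}-{nt_end}")
```

The same function applies to HJ1 with a coding region start of 371 in SEQ ID NO:1.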

Like Human Serrate-1 (HJ1), the "complete" (with an
internal deletion) Human Serrate-2 (HJ2) cDNA (SEQ ID NO:3)
generated on the computer encodes a protein containing 16
complete and 2 interrupted EGF repeats as well as the
diagnostic cryptic EGF repeat known as the DSL domain, which
has been found only in putative Notch ligands. The open
reading frame of the computer generated "complete" Human
Serrate-2 (HJ2) is about 1400 amino acids long, approximately
182 amino acids longer than the carboxy terminus of HJ1 and
the rat Serrate homologue Jagged. While there is significant
homology between the complete HJ2 and HJ1 in the amino
terminal portion of the protein, this homology is lost just
before the putative transmembrane domain at about amino acid
number 1029 of HJ1. This result is particularly interesting
because the presence of a long COOH-terminal tail implies the
possibility of some additional function or regulation of HJ2.

The "complete" (with an internal deletion) Human
Serrate-2 (HJ2) cDNA (SEQ ID NO:3) sequence can be
constructed by taking advantage of the unique restriction
sites for AccI, DraIII, or BamHI present in the sequence
overlap of pBS15 and pBS3-2, and which enzymes cleave the
pBS15 insert at nucleotides 1431, 2648, and 2802,
respectively.
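Whether an enzyme cuts a given insert only once can be checked by scanning the sequence for its recognition site. The following is a minimal sketch of such a scan. The recognition sequences (AccI, GTMKAC; DraIII, CACNNNGTG; BamHI, GGATCC) are the standard ones for these enzymes, but the helper names and the short test sequence are hypothetical and are not taken from pBS15 or pBS3-2.

```python
import re

# Recognition sequences with IUPAC ambiguity codes expanded into regex classes.
# All three sites are palindromic, so scanning one strand is sufficient.
SITES = {
    "AccI": "GT[AC][GT]AC",       # GTMKAC
    "DraIII": "CAC[ACGT]{3}GTG",  # CACNNNGTG
    "BamHI": "GGATCC",
}


def find_sites(seq, pattern):
    """Return the 1-based start positions of every match, including overlaps."""
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", seq.upper())]


def unique_cutters(seq):
    """Return {enzyme: position} for enzymes whose site occurs exactly once."""
    unique = {}
    for enzyme, pattern in SITES.items():
        positions = find_sites(seq, pattern)
        if len(positions) == 1:
            unique[enzyme] = positions[0]
    return unique


# Hypothetical toy sequence: one AccI site, one DraIII site, two BamHI sites.
toy = "AAGTCGACTTCACAAAGTGTTGGATCCAAGGATCC"
print(unique_cutters(toy))
```

In the toy sequence BamHI is excluded because its site occurs twice; applied to the overlap of two real inserts, the same test identifies sites usable for joining them.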
The expression of HJ2 in certain human tissues was
established by probing a Clontech Human Multiple Tissue
Northern blot with radio-labeled clone pBS15. This probe
hybridized to a single band of about 5.2 kb, and HJ2 was
expressed in heart, brain, placenta, lung, skeletal muscle, and
pancreas, but was absent or nearly undetectable in liver and
kidney. As in the case of HJ1 expression discussed supra,
the observation that the pBS15 insert component of HJ2 was
expressed in adult skeletal and heart muscle was particularly
interesting, because adult muscle fibers are completely
surrounded by a lamina of extracellular matrix, and it is
unlikely, therefore, that the role of HJ2 in these cells is
in direct cell-cell communication.

Expression constructs are made using the isolated
clone(s). The clone is excised from its vector as an EcoRI
restriction fragment(s) and subcloned into the EcoRI
restriction site of an expression vector. This allows for
the expression of the Human Serrate protein product from the
subclone in the correct reading frame. Using this
methodology, expression constructs in which the HJ1 cDNA
insert of pBS39 was cloned into an expression vector for
expression under the control of a cytomegalovirus promoter
have been generated and HJ1 has been expressed in both 3T3
and HAKAT human keratinocyte cell lines.
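The importance of the reading frame can be illustrated by translating the start of the HJ1 coding region. The sketch below uses the standard genetic code and nucleotides 361-409 of SEQ ID NO:1 (ten untranslated bases followed by the first 13 codons). The function name translate and the offset convention are hypothetical and do not describe the MacVector software.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, offset=0):
    """Translate dna from a 0-based offset until a stop codon or the end."""
    dna = dna.upper()
    protein = []
    for i in range(offset, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotides 361-409 of SEQ ID NO:1; the initiator ATG begins at nucleotide 371.
fragment = "GCGCGCAGCG" + "ATGCGTTCCCCACGGACACGCGGCCGGTCCGGGCGCCCC"

in_frame = translate(fragment, offset=10)   # starts at the ATG (371 - 361 = 10)
off_frame = translate(fragment, offset=11)  # shifted by one base
print(in_frame)
print(off_frame)
```

Read from the ATG, the fragment gives the first 13 amino acids of SEQ ID NO:2 (Met Arg Ser Pro Arg Thr Arg Gly Arg Ser Gly Arg Pro); shifting the start by a single base gives an unrelated peptide.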

10. DEPOSIT OF MICROORGANISMS
Plasmid pBS39, containing an EcoRI fragment
encoding full-length Human Serrate-1 (HJ1), was deposited on
February 28, 1995 with the American Type Culture Collection,
1201 Parklawn Drive, Rockville, Maryland 20852, under the
provisions of the Budapest Treaty on the International
Recognition of the Deposit of Microorganisms for the Purposes
of Patent Procedures, and assigned Accession No. 97068.

Plasmid pBS15, containing a 3.1 kb EcoRI fragment
encoding the amino terminus of Human Serrate-2 (HJ2), cloned
into the EcoRI site of Bluescript KS-, was deposited on March
5, 1996 with the American Type Culture Collection, 1201
Parklawn Drive, Rockville, Maryland 20852, under the
provisions of the Budapest Treaty on the International
Recognition of the Deposit of Microorganisms for the Purposes
of Patent Procedures, and assigned Accession No. .

Plasmid pBS3-2, containing a 3.2 kb EcoRI fragment
encoding the carboxy terminus of Human Serrate-2 (HJ2),
cloned into the EcoRI site of Bluescript KS-, was deposited
on March 5, 1996 with the American Type Culture Collection,
1201 Parklawn Drive, Rockville, Maryland 20852, under the
provisions of the Budapest Treaty on the International
Recognition of the Deposit of Microorganisms for the Purposes
of Patent Procedures, and assigned Accession No. .

The present invention is not to be limited in scope
by the microorganisms deposited or the specific embodiments
described herein. Indeed, various modifications of the
invention in addition to those described herein will become
apparent to those skilled in the art from the foregoing
description and accompanying figures. Such modifications are
intended to fall within the scope of the appended claims.

Various references are cited herein, the 
disclosures of which are incorporated by reference in their 
entireties. 



SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT: ISH-HOROWICZ, DAVID
                    HENRIQUE, DOMINGOS MANUEL PINTO
                    LEWIS, JULIAN HART
                    MYAT, ANNA MARY
                    ARTAVANIS-TSAKONAS, SPYRIDON
                    MANN, ROBERT S.
                    GRAY, GRACE E.

   (iii) NUMBER OF SEQUENCES: 18

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Pennie & Edmonds
          (B) STREET: 1155 Avenue of the Americas
          (C) CITY: New York
          (D) STATE: New York
          (E) COUNTRY: USA
          (F) ZIP: 10036-2711

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: To Be Assigned
          (B) FILING DATE: On Even Date Herewith
          (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Misrock, S. Leslie
          (B) REGISTRATION NUMBER: 18,872
          (C) REFERENCE/DOCKET NUMBER: 7326-037-228

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (212) 790-9090
          (B) TELEFAX: (212) 869-9741/8864
          (C) TELEX: 66141 PENNIE



(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 6464 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: cDNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 371..4027

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GAATTCCCCT CCCCCCTTTT TCCATGCAGC TGATCTAAAA GGGAATAAAA GGCTGCGCAT  60

AATCATAATA ATAAAAGAAG GGGAGCGCGA GAGAAGGAAA GAAAGCCGGG AGGTGGAAGA  120

GGAGGGGGAG CGTCTCAAAG AAGCGATCAG AATAATAAAA GGAGGCCGGG CTCTTTGCCT  180

TCTGGAACGC GCCGCTCTTG AAAGGGCTTT TGAAAAGTGG TGTTGTTTTC CAGTCGTCCA  240

TGCTCCAATC GGCGGAGTAT ATTAGAGCCG GGACGCGGCC GCAGGGGCAG CGGCGACGGC  300

AGCACCGGCG CCAGCACCAG CGCGAACAGC AGCGGCGGCG TCCCGAGTGC CCGCGGCGGC  360

GCGCGCAGCG ATG CGT TCC CCA CGG ACA CGC GGC CGG TCC GGG CGC CCC  409
           Met Arg Ser Pro Arg Thr Arg Gly Arg Ser Gly Arg Pro
             1               5                  10

CTA AGC CTC CTG CTC GCC CTC CTC TGT GCC CTG CGA GCC AAG CTG TGT  457
Leu Ser Leu Leu Leu Ala Leu Leu Cys Ala Leu Arg Ala Lys Val Cys
     15                  20                  25

GGG GCC TCG GGT CAG TTC GAG TTG GAG ATC CTG TCC ATG CAG AAC GTG  505
Gly Ala Ser Gly Gln Phe Glu Leu Glu Ile Leu Ser Met Gln Asn Val
 30                  35                  40                  45

AAC GGG GAG CTG CAG AAC GGG AAC TGC TGC GGC GGC GCC CGG AAC CCG  553
Asn Gly Glu Leu Gln Asn Gly Asn Cys Cys Gly Gly Ala Arg Asn Pro
                 50                  55                  60

GGA GAC CGC AAG TGC ACC CGC GAC GAG TGT GAC ACA TAC TTC AAA GTG  601
Gly Asp Arg Lys Cys Thr Arg Asp Glu Cys Asp Thr Tyr Phe Lys Val
             65                  70                  75

TGC CTC AAG GAG TAT CAG TCC CGC GTC ACG GCC GGG GGG CCC TGC AGC  649
Cys Leu Lys Glu Tyr Gln Ser Arg Val Thr Ala Gly Gly Pro Cys Ser
         80                  85                  90

TTC GGC TCA GGG TCC ACG CCT GTC ATC GGG GGC AAC ACC TTC AAC CTC  697
Phe Gly Ser Gly Ser Thr Pro Val Ile Gly Gly Asn Thr Phe Asn Leu
     95                 100                 105

AAG GCC AGC CGC GCC AAC GAC CCG AAC CGC ATC GTG CTG CCT TTC ACT  745
Lys Ala Ser Arg Gly Asn Asp Pro Asn Arg Ile Val Leu Pro Phe Ser
110                 115                 120                 125

TTC GCC TGG CCG AGG TCC TAT ACG TTG CTT GTG GAG GCG TGG GAT TCC  793
Phe Ala Trp Pro Arg Ser Tyr Thr Leu Leu Val Glu Ala Trp Asp Ser
                130                 135                 140

AGT AAT GAC ACC GTT CAA CCT GAC AGT ATT ATT GAA AAG GCT TCT CAC  841
Ser Asn Asp Thr Val Gln Pro Asp Ser Ile Ile Glu Lys Ala Ser His
            145                 150                 155

TCG GGC ATG ATC AAC CCC AGC CGG CAG TGG CAG ACG CTG AAG CAG AAC  889
Ser Gly Met Ile Asn Pro Ser Arg Gln Trp Gln Thr Leu Lys Gln Asn
        160                 165                 170

ACG GGC GTT GCC CAC TTT CAG TAT CAG ATC CGC GTG ACC TGT GAT GAC  937
Thr Gly Val Ala His Phe Glu Tyr Gln Ile Arg Val Thr Cys Asp Asp
    175                 180                 185

TAC TAC TAT GGC TTT GGC TGT AAT AAG TTC TGC CGC CCC AGA GAT GAC  985
Tyr Tyr Tyr Gly Phe Gly Cys Asn Lys Phe Cys Arg Pro Arg Asp Asp
190                 195                 200                 205

TTC TTT GGA CAC TAT GCC TGT GAC CAG AAT GGC AAC AAA ACT TGC ATG  1033
Phe Phe Gly His Tyr Ala Cys Asp Gln Asn Gly Asn Lys Thr Cys Met
                210                 215                 220

GAA CGC TGG ATG GGC CCC GAA TGT AAC AGA GCT ATT TGC CGA CAA GGC  1081
Glu Gly Trp Met Gly Pro Glu Cys Asn Arg Ala Ile Cys Arg Gln Gly
            225                 230                 235

TGC AGT CCT AAG CAT GGG TCT TGC AAA CTC CCA GGT GAC TGC AGG TGC  1129
Cys Ser Pro Lys His Gly Ser Cys Lys Leu Pro Gly Asp Cys Arg Cys
        240                 245                 250

CAG TAC GGC TGG CAA GGC CTG TAC TGT GAT AAG TGC ATC CCA CAC CCG  1177
Gln Tyr Gly Trp Gln Gly Leu Tyr Cys Asp Lys Cys Ile Pro His Pro
    255                 260                 265

GCA TGC GTC CAC GGC ATC TGT AAT GAG CCC TGG CAG TGC CTC TGT CAG  1225
Gly Cys Val His Gly Ile Cys Asn Glu Pro Trp Gln Cys Leu Cys Glu
270                 275                 280                 285

ACC AAC TGC GGC GGC CAG CTC TGT GAC AAA GAT CTC AAT TAC TGT GGG  1273
Thr Asn Trp Gly Gly Gln Leu Cys Asp Lys Asp Leu Asn Tyr Cys Gly
                290                 295                 300

ACT CAT CAG CCG TGT CTC AAC GGG GGA ACT TGT AGC AAC ACA GGC CCT  1321
Thr His Gln Pro Cys Leu Asn Gly Gly Thr Cys Ser Asn Thr Gly Pro
            305                 310                 315

GAC AAA TAT CAG TGT TCC TGC CCT GAG GGG TAT TCA GGA CCC AAC TGT  1369
Asp Lys Tyr Gln Cys Ser Cys Pro Glu Gly Tyr Ser Gly Pro Asn Cys
        320                 325                 330

GAA ATT GCT GAG CAC GCC TGC CTC TCT GAT CCC TGT CAC AAC AGA GGC  1417
Glu Ile Ala Glu His Ala Cys Leu Ser Asp Pro Cys His Asn Arg Gly
    335                 340                 345

AGC TGT AAG GAG ACC TCC CTG GCC TTT GAG TGT GAG TGT TCC CCA GGC  1465
Ser Cys Lys Glu Thr Ser Leu Gly Phe Glu Cys Glu Cys Ser Pro Gly
350                 355                 360                 365

TGG ACC CGC CCC ACA TGC TCT ACA AAC ATT GAT GAC TGT TCT CCT AAT  1513
Trp Thr Gly Pro Thr Cys Ser Thr Asn Ile Asp Asp Cys Ser Pro Asn
                370                 375                 380

AAC TGT TCC CAC GGG GGC ACC TGC CAG GAC CTG GTT AAC GGA TTT AAG  1561
Asn Cys Ser His Gly Gly Thr Cys Gln Asp Leu Val Asn Gly Phe Lys
            385                 390                 395

TGT GTG TCC CCC CCA CAG TGG ACT GGG AAA ACG TGC CAG TTA GAT GCA  1609
Cys Val Cys Pro Pro Gln Trp Thr Gly Lys Thr Cys Gln Leu Asp Ala
        400                 405                 410

AAT CAA TGT CAG GCC AAA CCT TGT GTA AAC GCC AAA TCC TGT AAG AAT  1657
Asn Glu Cys Glu Ala Lys Pro Cys Val Asn Ala Lys Ser Cys Lys Asn
    415                 420                 425

CTC ATT GCC ACC TAC TAC TGC GAC TGT CTT CCC GGC TGG ATG GGT CAC  1705
Leu Ile Ala Ser Tyr Tyr Cys Asp Cys Leu Pro Gly Trp Met Gly Gln
430                 435                 440                 445

AAT TGT GAC ATA AAT ATT AAT GAC TGC CTT GGC CAG TGT CAG AAT GAC  1753
Asn Cys Asp Ile Asn Ile Asn Asp Cys Leu Gly Gln Cys Gln Asn Asp
                450                 455                 460

GCC TCC TGT CGG GAT TTG GTT AAT GGT TAT CCC TGT ATC TCT CCA CCT  1801
Ala Ser Cys Arg Asp Leu Val Asn Gly Tyr Arg Cys Ile Cys Pro Pro
            465                 470                 475
GGC TAT GCA GGC GAT CAC TGT GAG AGA GAC ATC GAT GAA TGT GCC AGC
Gly Tyr Ala Gly Asp His Cys Glu Arg Asp Ile Asp Glu Cys Ala Ser
        480                 485                 490

AAC CCC TGT TTG AAT GGG GGT CAC TGT CAG AAT GAA ATC AAC AGA TTC
Asn Pro Cys Leu Asn Gly Gly His Cys Gln Asn Glu Ile Asn Arg Phe
    495                 500                 505

Gin ^1 f TG S CC ACT °° T TTC TCT GGA ** C CTC TGT CAG CTG GAC 

sio eu ys I?c Gly phe Ser Gly Asn Leu °y 6 °i 

n Leu Asp 
520 525 

A Tf TGT GAC CCT ^ CCC TGC CAG AAC GGT GCC CAG TGC TAG 

He Asp Tyr Cys Glu Pro Asn Pro Cys Gin Asn Gly Ala Cys lyr 

535 540 

£n £2 A?° t AC I AT T 3 C TCC AAG TGC CCC G ^G GAC TAT GAC GGC 

Asn Arg Ala Ser Asp Tyr Phe Cys Lye Cys Pro Glu Asp Tyr Glu Gly 
s<5 550 555 J 

AAG AAC TGC TCA CAC CTG AAA GAC CAC TGC CGC ACG ACC CCC TGT GAA 
Lye Aen Cye Ser His Leu Lys Aep Hie Cye Arg Thr Thr Pro Cys g™ 

565 570 

VaT ill £ GC I° C If* GTG CCC ATG GCT TCC AAC GAC ACA CCT GAA 

val lie Aep Ser Cye Thr Val Ala Met Ala Ser Asn Asp Thr Pro Glu 
3/3 580 585 

G^v 51? *™ t" ?T T I CC TCC AAC CTC TGT «» CCT CAC GGG AAG TGC 
Gly val Arg Tyr lie Ser Ser Aen Val Cye Gly Pro His Gly Lye Cys 

595 " 600 60S 

AAG ACT CAG TCG GGA GGC AAA TTC ACC TGT GAC TGT AAC AAA GGC TTC 
Lye ser Gin Ser Gly Gly Lys Phe Thr Cye Asp Cyl £n Lys tly "e 

610 615 520 

JSS ?? A 5k A l AC TGC CAT GAA ^ ATT ** T G ™ TCT GAG AGC AAC CCT 
Thr Gly Thr Tyr Cys His Glu Asn He Aen Asp Cys Glu Ser Asn Pro 

625 630 635 

TGT AGA AAC GGT GGC ACT TGC ATC GAT GGT GTC AAC TCC TAC AAG TGC 
Cys Arg Asn Gly Gly Thr Cys lie Asp Gly Val Asn Ser lyr £s Cys 

645 650 

He sll a« C r GC I™ S AC GGC ° CC TAC TCT GAA ACC AAT ATT AAT 

P Gly Trp Clu Gl y Ala Glu ™* Asn lie Asn 

655 660 665 



GAC TCC AGC CAG AAC CCC TCC CAC AAT GGG GGC ACG TGT CGC GAC CTG 
Asp Cys Ser Gin Asn Pro Cys His Asn Gly Gly Thr Cys Arg Asp S5 
670 675 680 ^ 685 



1849 



1897 



1945 



1993 



2041 



2089 



2137 



2185 



2233 



2281 



2329 



2377 



2425 



2473 



2521 



GTC AAT GAC TTC TAC TGT GAC TGT AAA AAT GGG TGG AAA GGA AAC ACC 
Val Asn Asp Phe Tyr Cys Asp Cys Lys Asn Gly Trp Lys Gly Lys Thr 
690 695 700 

TGC CAC TCA CGT GAC ACT CAG TGT GAT GAG GCC ACG TGC AAC AAC GGT 
cys His Ser Arg Asp Ser Gin Cys Aep Glu Ala Thr Cys Asn Asn Gly 
705 710 715 1 

GGC ACC TGC TAT GAT GAG GGG GAT GCT TTT AAG TGC ATC TGT CCT GGC 2 569 

Gly Thr Cys Tyr Asp Glu Cly Asp Ala Phe Lys Cys Met Cys Pro Cly 
720 725 730 

£?S J™ GAA °? A ACA ACC TGT AAC ATA GCC CGA AAC ACT AGC TGC CTG 2 617 

Gly Trp Clu Cly Thr Thr Cys Asn lie Ala Arg Asn Ser Ser Cys Leu 
7 -35 740 745 






s as s ss k a; a a s a a a as a a: e 
s: s 5J a ss ssss a a sss a a a a 

775 780 

K & S£ g SS B B E SSS B 55 SS B B B B 

790 7 g 5 

s a; 5 as as 5 a g ss a b s a a a K 
5 b as 5J a: s as a a: a: ss a b s a « 
s s a s ;s s: a a; a a k a b s b a 
g si a sj ss a a a a s a a 

850 855 86 J y 

s ss ;ts s a a a; a a s a? a ss a: s ss 

875 

CAT CAC TGT AAT ACC TGC CAC TCC CTG AAT GGA COG ATC GCC TGC Tr» 
Asp Asp Cy B Asn Thr Cys Gin Cys U U Asn Gly Arg lie Ma Cys ™ 

885 ago 

AAG GTC TGC TGT GGC CCT CGA CCT TGC CTG CTC CAC AAA ccr r»r 

VaX Trp Cys Gly Pro Arg Pro Cye Leu ™ ™ CAC ACC 

' uu 905 

ss ss a a a a ss ss ;is sss a a a as ss ss 

ais 920 9J5 

Phf SI? S* C f° C TGC ACT GGT CTC CGC 0AC TOT COO TCT TCC AGT CTC 
Phe val hi. Pro Cye Thr Gly Val Gly olu Cy B Arg Ser Ser s" Leu 

930 935 940 

CAC CCC GTG AAG ACA AAG TGC ACC TCT GAC TCC TAT TAC CAC CAT **n 
Gin Pro Val Lya Thr Lye Cys Thr Ser Asp Ser lyl lyr Oln "p £n 

945 950 955 

TGT GCG AAC ATC ACA TTT ACC TTT AAC AAG GAG ATG ATG TCA CCA GGT 
Cye Ala Asn He Thr Phe Thr Phe Asn Lye clu nit Met ler PrS £y 

CTT ACT ACC GAG CAC ATT TGC AGT GAA TTG AGG AAT TTG AAT ATT TTr 
Leu Thr Thr Glu His lie Cys Ser Clu Leu Arg £n Leu En 111 Uu 
" 3 980 985 

AAG AAT GTT TCC CCT GAA TAT TCA ATC TAC ATC GCT TGC GAG CCT TCC 
Lye Asn Val Ser Ala Glu Tyr Ser lie Tyr lie Ala Cys tlu Pro Ser 

995 1000 J005 

Pr« If* i*° GAA ATA CAT GTG GCC ATT TC T GCT GAA GAT ATA 

Pro Ser Ala Asn Asn Clu lie His Val Ala lie Ser Ala Glu Asp lie 

1010 1015 ao20 



2665 



2713 



2761 



2809 



2857 



2905 



2953 



3001 



3049 



3097 



3145 



3193 



3241 



3289 



3337 



3385 



3433 
CGG GAT GAT GGG AAC CCG ATC AAG GAA ATC ACT GAC AAA ATA ATC GAT  3481
Arg Asp Asp Gly Asn Pro Ile Lys Glu Ile Thr Asp Lys Ile Ile Asp
           1025                1030                1035

CTT GTT ACT AAA CGT GAT GGA AAC AGC TCG CTG ATT GCT GCC GTT GAA  3529
Leu Val Thr Lys Arg Asp Gly Asn Ser Ser Leu Ile Ala Ala Val Glu
       1040                1045                1050

GAA GTA AGA GTT CAG AGG CGG CCT CTG AAG AAC AGA ACA GAT TTC CTT  3577
Glu Val Arg Val Gln Arg Arg Pro Leu Lys Asn Arg Thr Asp Phe Leu
   1055                1060                1065

GTT CCC TTG CTG AGC TCT GTC TTA ACT GTG GCT TGG ATC TGT TGC TTG  3625
Val Pro Leu Leu Ser Ser Val Leu Thr Val Ala Trp Ile Cys Cys Leu
1070               1075                1080                1085

GTG ACG GCC TTC TAC TGG TGC CTG CGG AAG CGG CGG AAG CCG GGC AGC  3673
Val Thr Ala Phe Tyr Trp Cys Leu Arg Lys Arg Arg Lys Pro Gly Ser
               1090                1095                1100

CAC ACA CAC TCA GCC TCT GAG GAC AAC ACC ACC AAC AAC GTG CGG GAG  3721
His Thr His Ser Ala Ser Glu Asp Asn Thr Thr Asn Asn Val Arg Glu
           1105                1110                1115

CAG CTG AAC CAG ATC AAA AAC CCC ATT GAG AAA CAT GGG GCC AAC ACG  3769
Gln Leu Asn Gln Ile Lys Asn Pro Ile Glu Lys His Gly Ala Asn Thr
       1120                1125                1130

GTC CCC ATC AAG GAT TAC GAG AAC AAG AAC TCC AAA ATG TCT AAA ATA  3817
Val Pro Ile Lys Asp Tyr Glu Asn Lys Asn Ser Lys Met Ser Lys Ile
   1135                1140                1145

AGG ACA CAC AAT TCT GAA GTA GAA GAG GAC GAC ATG GAC AAA CAC CAG  3865
Arg Thr His Asn Ser Glu Val Glu Glu Asp Asp Met Asp Lys His Gln
1150               1155                1160                1165

CAC AAA GCC CGG TTT GCC AAG CAG CCG GCG TAC ACG CTG GTA GAC AGA  3913
Gln Lys Ala Arg Phe Ala Lys Gln Pro Ala Tyr Thr Leu Val Asp Arg
               1170                1175                1180

GAA GAG AAG CCC CCC AAC GCC ACG CCG ACA AAA CAC CCA AAC TGG ACA  3961
Glu Glu Lys Pro Pro Asn Gly Thr Pro Thr Lys His Pro Asn Trp Thr
           1185                1190                1195

AAC AAA CAG GAC AAC AGA GAC TTG GAA ACT GCC CAG AGC TTA AAC CGA  4009
Asn Lys Gln Asp Asn Arg Asp Leu Glu Ser Ala Gln Ser Leu Asn Arg
       1200                1205                1210

ATG GAG TAC ATC GTA TAG CAGACCGCGG GCACTGCCGC CGCTAGGTAG  4057
Met Glu Tyr Ile Val
   1215

AGTCTGAGGG CTTGTACTTC TTTAAACTGT CGTGTCATAC TCGAGTCTGA GGCCGTTGCT  4117
GACTTAGAAT CCCTGTGTTA ATTTAGTTTG ACAAGCTGGC TTACACTGGC AATGGTAGTT  4177
CTGTGGTTGG CTCCGAAATC GAGTCGCGCA TCTCACAGCT ATGCAAAAAG CTAGTCAACA  4237
CTACCCCTGG TTGTGTGTCC CCTTGCAGCC GACACGGTCT CGGATCAGGC TCCCAGGAGC  4297
TGCCCAGCCC CCTGGTACTT TGAGCTCCCA CTTCTGCCAG ATGTCTAATG GTGATGCAGT  4357
CTTAGATCAT AGTTTTATTT ATATTTATTG ACTCTTGAGT TCTTTTTGTA TATTGGTTTT  4417
ATGATGACGT ACAAGTAGTT CTGTATTTGA AAGTGCCTTT GCAGCTCAGA ACCACAGCAA  4477
CGATCACAAA TCACTTTATT ATTTATTTTT TTTAATTGTA TTTTTGTTGT TGGGGGAGGG  4537
GAGACTTTGA TGTCAGCAGT TGCTGGTAAA ATGAAGAATT TAAAGAAAAA ATGTCCAAAA  4597
GTAGAACTTT GTATAGTTAT GTAAATAATT CTTTTTTATT AATCACTGTG TATATTTGAT  4657
TTATTAACTT AATAATCAAG AGCCTTAAAA CATCATTCCT TTTTATTTAT ATGTATGTGT  4717
TTAGAATTGA AGGTTTTTGA TAGCATTGTA AGCGTATGGC TTTATTTTTT TGAACTCTTC  4777
TCATTACTTG TTGCCTATAA GCCAAAAAGG AAAGGGTGTT TTGAAAATAG TTTATTTTAA  4837
AACAATAGGA TGGGCTACAC GTACATAGGT AAATAATAGC ACCGTACTGG TTATGATGAT  4897
GAAAATAACT GGAAACTTGA AAGCTTGTGG TAATCGCAGA TAAAGATGGT TCACCTGGGA  4957
AATTAAAACT TGAATGGTTG TACAGAAAAG CACAGAGTGG AATGCACATC AATGACAGTA  5017
AGGGAGTTAG TTCTAGGAAC AGCTCCTGAA CAGTAAGATT CCCGCAATAG TCTCCGCCTC  5077
GTTCGTCTAT GGTATGCATC CCATTCATTT TCTTCTTCTG ATTATTGTCA TCTTTCCCTT  5137
TGCCAAATGG GCAGTTATTG TTTCAGGGAG AGAAGCTGCT CATTGGCCAA TCATTCTGGT  5197
GTGCAGTGCT CCATCGGATT CTACATCTCC AACAAGGCAT GTCTGGATGA TGCAATGTCT  5257
GTCTGACCCC CGGAATTCCG TGCAGAGACA ACATTCTAGA CAGATATACA CTTTTTATTA  5317
TTAACAAACT TTGGCCACAA CCTTTGATGT ATAAATTGCC GGATTTCCCC AGTCCTTTCA  5377
TTGTGGCTTT GGACAGGAGC AGGCTCACTT GTCTGCTTCA GGCTGCCTTT CTCTTGGGTT  5437
GCACCTCAGT TCTTACTTAT TTATTTATTT TGAGTGGAGC ATAGGGGCCT CTTCCAAAAT  5497
GGGTAGAGCT CAGGGGCTTT CTTATTGAAA TGGTCACATG ATAAAAACGG GCTGAAAAAG  5557
GAGAGTTCCA GGAGAAAAGC CCAGAAAAGG CCCCTCCTCA GAAGACAGCC TTTAAGCCTC  5617
TTGCTTACTG AAGGAAGCCC CACCTTCTAG CACTGAGGCC GGGTCTGATC TTCCAGAGGA  5677
GTTGGAGGAG TCCATGAGAA TGGCCACCAT TCTTGCTTGC TGCTGCTGAT GTTGCAGTTT  5737
TGAGAGAACA GCGGGATCCT TGTTGTCCTC TAGAGACTTG AGTCTGTCAC TGACATTTTT  5797
TCAGTTCCTT TGCTCATAGA CCATACGAGG AATTAGTGAT GTGTCAGTTG AGAGTTCACA  5857
ATCTCATTGT TCATTTAATT CACTTTAAAG TTGTCAATTT CTGTGTGAGT AACCTGTAAA  5917
AGACACCTTT CCAGAAGAGT TTTGCCGTCT GTTTGAAAAA AAAATCTTTA TAAACTTTCC  5977
TAAGTATCTG GATTTGGATT CCTTATTTGG AGAGAAAATG TACCCTGTCT CCACCAAAAA  6037
TACAAAAATT AGCCAGGCTT GGTGGTGCAC ACCGGTAATC CCAGCAACTC TGGAGACTAA  6097
GGCAGGAAGA ATCGCTTGAC CCAGGAGGGT CGAGGCTACA ATGAGTTGAA ACCGCGCCAC  6157
TGCACTCCAG CCTGGGCGAC AGTGCGAGGC CCTGTCTCAA AAATAAAATA AAATAAATAA  6217
ATAAATTAGC CAGATACTGT GTGCACGCCT GCAGTCCCAG CTATTCTGGA AGCTGAGGTG  6277
GGAAGATGGT TAAGCCTGAG AGGACAAAGC TGCAGTGAGT CATGTTTGCA TCACTGCACT  6337
CCAGCCTGGG TGACAGAGCA AGACCCTGTC TAAAAAACAA AAACAGGCCG GGTGTGGTGG  6397
CTCATGCCTC CCATCCCAGT GCTTTGGGAG GCAGAGGTTG GCATAATCCC AGCGCTCTGG  6457
GAATTCC  6464
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(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1219 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

<ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Arg Ser Pro Arg Thr Arg Gly Arg Ser Cly Arg Pro Leu Ser Leu 
15 10 15 

Leu Leu Ala Leu Leu Cye Ala Leu Arg Ala Lye Val Cye Gly Ala Ser 
20 25 30 

Gly Gin Phe Glu Leu Glu He Leu Ser Met Gin Asn Val Aen Gly Clu 
35 40 45 

Leu Gin Aen Gly Aen Cys Cye Gly Gly Ala Arg Asn Pro Gly Asp Arg 
50 55 60 

Lys Cys Thr Arg Asp Glu Cys Asp Thr Tyr Phe Lys Val Cys Leu Lys 
65 70 75 80 

Glu Tyr Gin Ser Arg Val Thr Ala Gly Gly Pro Cys Ser Phe Gly Ser 
85 90 95 

Gly Ser Thr Pro Val He Gly Gly Asn Thr Phe Asn Leu Lys Ala Ser 
100 105 HO 

Arg Gly Asn Asp Pro ABn Arg He Val Leu Pro Phe Ser Phe Ala Trp 
115 120 125 

Pro Arg Ser Tyr Thr Leu Leu Val Glu Ala Trp Asp Ser Ser Asn Asp 
130 135 140 

Thr Val Gin Pro Asp Ser He He Glu Lys Ala Ser HiB Ser Gly Met 
145 150 155 160 

He Asn Pro Ser Arg Gin Trp Gin Thr Leu Lys Gin Asn Thr Gly Val 
165 170 175 

Ala Hie Phe Glu Tyr Gin He Arg Val Thr Cys Asp Asp Tyr Tyr Tyr 
180 185 190 

Gly Phe Gly Cys Asn Lys Phe Cys Arg Pro Arg Asp Asp Phe Phe Gly 
195 200 205 

Hia Tyr Ala Cys Asp Gin Aen Gly Asn Lys Thr Cys Met Glu Gly Trp 
210 215 220 

Met Gly Pro Glu Cys Asn Arg Ala He Cys Arg Gin Gly Cys Ser Pro 
225 230 235 240 

Lys His Gly Ser Cys Lys Leu Pro Cly Asp Cys Arg Cys Gin Tyr Gly 
245 250 255 

Trp Gin Gly Leu tyr Cys Asp Lys Cye He Pro His Pro Gly Cys Val 
260 265 270 

His Gly He Cys Asn Glu Pro Trp Gin Cys Leu Cys Glu Thr Asn Trp 
275 280 285 

Gly Gly Gin Leu Cys Asp Lys Asp Leu Asn Tyr Cys Gly Thr His Gin 
290 295 300 
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Pro Cyc Leu Aen Gly Gly Thr Cye Ser Asn Thr Gly Pro Asp Lys Tyr 
305 310 315 320 

Gin Cys Ser CyB Pro Glu Gly Tyr Ser Gly Pro Asn Cys Glu lie Ala 
325 330 335 

Glu Hie Ala Cye Leu Ser Aep Pro Cye Hie Asn Arg Gly Ser Cye Lye 
340 345 350 

Glu Thr Ser Leu Gly Phe Glu Cye Glu Cye Ser Pro Gly Trp Thr Gly 
355 360 365 

Pro Thr Cye Ser Thr Aen lie Aep Aep Cye Ser Pro Aen Aen Cye Ser 
370 375 380 

Hie Gly Gly Thr Cye Gin Aep Leu Val Aen Gly Phe Lye Cye Val Cye 
385 390 395 400 

Pro Pro Gin Trp Thr Gly Lye Thr Cye Gin Leu Asp Ala Aen Glu Cys 
405 410 415 

Glu Ala Lye Pro Cye Val Asn Ala Lye Ser Cys Lys Asn Leu lie Ala 
420 425 430 

Ser Tyr Tyr Cys Asp Cye Leu Pro Gly Trp Met Gly Gin Asn Cys Asp 
435 440 445 

lie Asn lie Asn Aep Cye Leu Gly Gin Cys Gin Aen Asp Ala Ser Cys 
450 455 460 

Arg Asp Leu Val Asn Gly Tyr Arg Cye lie Cye Pro Pro Gly Tyr Ala 
465 470 475 480 

Gly Asp His Cys Glu Arg Asp lie Asp Glu Cys Ala Ser Asn Pro Cye 
485 490 495 

Leu Asn Gly Gly His Cys Gin Asn Glu lie Asn Arg Phe Gin Cys Leu 
500 505 510 

Cys Pro Thr Gly Phe Ser Gly Asn Leu Cys Gin Leu Asp lie Asp Tyr 
515 520 525 

Cys Glu Pro Aen Pro Cys Gin Asn Gly Ala Gin Cys Tyr Asn Arg Ala 
530 535 540 

Ser Aep Tyr Phe Cye Lys Cye Pro Glu Aep Tyr Glu Gly Lye Aen CyB 
545 550 555 560 

Ser His Leu Lys Asp His Cye Arg Thr Thr Pro Cye Glu Val lie Aep 
565 570 575 

Ser Cys Thr Val Ala Met Ala Ser Aen Asp Thr Pro Glu Gly Val Arg 

580 585 590 

Tyr He Ser Ser Asn Val Cye Gly Pro Hie Gly Lye Cye Lys Ser Gin 
595 600 605 

Ser Gly Gly Lys Phe Thr Cys Asp Cys Asn Lys Gly Phe Thr Gly Thr 
610 615 620 

Tyr Cys His Glu Asn Ile Asn Asp Cys Glu Ser Asn Pro Cys Arg Asn 
625 630 635 640 

Gly Gly Thr Cys Ile Asp Gly Val Asn Ser Tyr Lys Cys Ile Cys Ser 
645 650 655 

Asp Gly Trp Glu Gly Ala Tyr Cys Glu Thr Asn Ile Asn Asp Cys Ser 
660 665 670 

Gln Asn Pro Cys His Asn Gly Gly Thr Cys Arg Asp Leu Val Asn Asp 
675 680 685 

Phe Tyr Cys Asp Cys Lys Asn Gly Trp Lys Gly Lys Thr Cys His Ser 
690 695 700 

Arg Asp Ser Gln Cys Asp Glu Ala Thr Cys Asn Asn Gly Gly Thr Cys 
705 710 715 720 

Tyr Asp Glu Gly Asp Ala Phe Lys Cys Met Cys Pro Gly Gly Trp Glu 
725 730 735 

Gly Thr Thr Cys Asn Ile Ala Arg Asn Ser Ser Cys Leu Pro Asn Pro 
740 745 750 

Cys His Asn Gly Gly Thr Cys Val Val Asn Gly Glu Ser Phe Thr Cys 
755 760 765 

Val Cys Lys Glu Gly Trp Glu Gly Pro Ile Cys Ala Gln Asn Thr Asn 
770 775 780 

Asp Cys Ser Pro His Pro Cys Tyr Asn Ser Gly Thr Cys Val Asp Gly 
785 790 795 800 

Asp Asn Trp Tyr Arg Cys Glu Cys Ala Pro Gly Phe Ala Gly Pro Asp 
805 810 815 

Cys Arg Ile Asn Ile Asn Glu Cys Gln Ser Ser Pro Cys Ala Phe Gly 
820 825 830 

Ala Thr Cys Val Asp Glu Ile Asn Gly Tyr Arg Cys Val Cys Pro Pro 
835 840 845 

Gly His Ser Gly Ala Lys Cys Gln Glu Val Ser Gly Arg Pro Cys Ile 
850 855 860 

Thr Met Gly Ser Val Ile Pro Asp Gly Ala Lys Trp Asp Asp Asp Cys 
865 870 875 880 

Asn Thr Cys Gln Cys Leu Asn Gly Arg Ile Ala Cys Ser Lys Val Trp 
885 890 895 

Cys Gly Pro Arg Pro Cys Leu Leu His Lys Gly His Ser Glu Cys Pro 
900 905 910 

Ser Gly Gln Ser Cys Ile Pro Ile Leu Asp Asp Gln Cys Phe Val His 
915 920 925 

Pro Cys Thr Gly Val Gly Glu Cys Arg Ser Ser Ser Leu Gln Pro Val 
930 935 940 

Lys Thr Lys Cys Thr Ser Asp Ser Tyr Tyr Gln Asp Asn Cys Ala Asn 
945 950 955 960 

Ile Thr Phe Thr Phe Asn Lys Glu Met Met Ser Pro Gly Leu Thr Thr 
965 970 975 

Glu His Ile Cys Ser Glu Leu Arg Asn Leu Asn Ile Leu Lys Asn Val 
980 985 990 

Ser Ala Glu Tyr Ser Ile Tyr Ile Ala Cys Glu Pro Ser Pro Ser Ala 
995 1000 1005 

Asn Asn Glu Ile His Val Ala Ile Ser Ala Glu Asp Ile Arg Asp Asp 
1010 1015 1020 

Gly Asn Pro Ile Lys Glu Ile Thr Asp Lys Ile Ile Asp Leu Val Thr 
1025 1030 1035 1040 

Lys Arg Asp Gly Asn Ser Ser Leu Ile Ala Ala Val Glu Glu Val Arg 
1045 1050 1055 

Val Gln Arg Arg Pro Leu Lys Asn Arg Thr Asp Phe Leu Val Pro Leu 
1060 1065 1070 

Leu Ser Ser Val Leu Thr Val Ala Trp Ile Cys Cys Leu Val Thr Ala 
1075 1080 1085 

^ ToIo Trp ° yS ^ Arg VXL Arg Arg Lys Pro Ser His Thr His 
1090 1095 1100 

Ser Ala Ser Glu Asp Asn Thr Thr Asn Asn Val Arg Glu Gln Leu Asn 
1105 1110 1115 1120 

Gln Ile Lys Asn Pro Ile Glu Lys His Gly Ala Asn Thr Val Pro Ile 
1125 1130 1135 

Lys Asp Tyr Glu Asn Lys Asn Ser Lys Met Ser Lys Ile Arg Thr His 
1140 1145 1150 

Asn Ser Glu Val Glu Glu Asp Asp Met Asp Lys His Gln Gln Lys Ala 
1155 1160 1165 

Arg f?* Ala Lys Gln Pro Ala Th * Leu Val Asp Arg Glu Glu Lys 
1170 1175 1180 

Pro Pro Asn Gly Thr Pro Thr Lys His Pro Asn Trp Thr Asn Lys Gln 
1185 1190 1195 1200 

Asp Asn Arg Asp Leu Glu Ser Ala Gln Ser Leu Asn Arg Met Glu Tyr 
1205 1210 1215 

Ile Val 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4483 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 332..4483 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GGCCGGGGCC GGGCGGGCGG GTCGCGGGGG CAATGCGGGC GCAGGGCCGG GGGCGCCTTC 
CCCGGCGGCT GCTGCTGCTG CTGGCGCTCT GGGTGCAGGC GGCGCGGCCC ATGGGCTATT 
TCGAGCTGCA GCTCAGCGCG CTGCGGAACG TGAACGGGGA GCTGCTGAGC GGCGCCTGCT 
GTCACGGCGA CGGCCGGACA ACGCGCGCGG GGGGCTGCGC CCACGACGAG TGCGACACCG 
CTCCTTTACC CTCATCGTGG AGGCCTGGGA CTGGGACAAC GATACCACCC CGAATGAGGA 
GCTGCTGATC GAGCGAGTGT CGCATGCCGG C ATG ATC AAC CCG GAG GAC CGC 352 
Met Ile Asn Pro Glu Asp Arg 
1 5 

TGG AAG AGC CTG CAC TTC AGC GGC CAC GTG GCG CAC CTG GAG CTG CAG 400 
Trp Lys Ser Leu His Phe Ser Gly His Val Ala His Leu Glu Leu Gln 
10 15 20 

ATC CGC GTG CGC TGC GAC GAG AAC TAC TAC AGC GCC ACT TGC AAC AAG 448 
Ile Arg Val Arg Cys Asp Glu Asn Tyr Tyr Ser Ala Thr Cys Asn Lys 
25 30 35 

TTC TGC CGG CCC CGC AAT GAC TTT TTC GGC CAC TAC ACC TGC GAC CAG 496 
Phe Cys Arg Pro Arg Asn Asp Phe Phe Gly His Tyr Thr Cys Asp Gln 
40 45 50 55 

TAC GGC AAC AAG GCC TGC ATG GAC GGC TGG ATG GGC AAG GAG TGC AAG 544 
Tyr Gly Asn Lys Ala Cys Met Asp Gly Trp Met Gly Lys Glu Cys Lys 
60 65 70 

GAA GCT GTG TGT AAA CAA GGG TGT AAT TTG CTC CAC GCG GCA TGC ACC 592 
Glu Ala Val Cys Lys Gln Gly Cys Asn Leu Leu His Gly Gly Cys Thr 
75 80 85 

GTG CCT GGG GAG TGC AGG TGC AGC TAC GGC TGG CAA GGG AGG TTC TGC 640 
Val Pro Gly Glu Cys Arg Cys Ser Tyr Gly Trp Gln Gly Arg Phe Cys 
90 95 100 

GAT GAG TGT CTC CCC TAC CCC GCC TGC GTG CAT GGC AGT TGT GTG GAG 688 
Asp Glu Cys Val Pro Tyr Pro Gly Cys Val His Gly Ser Cys Val Glu 
105 110 115 

CCC TGG CAG TGC AAC TGT GAG ACC AAC TGG GGC GGC CTG CTC TGT GAC 736 
Pro Trp Gln Cys Asn Cys Glu Thr Asn Trp Gly Gly Leu Leu Cys Asp 
120 125 130 135 

AAA CAC CTG AAC TAC TGT GGC AGC CAC CAC CCC TGC ACC AAC GGA GGC 784 
Lys Asp Leu Asn Tyr Cys Gly Ser His His Pro Cys Thr Asn Gly Gly 
140 145 150 

ACG TGC ATC AAC GCC GAG CCT GAC CAG TAC CGC TGC ACC TGC CCT GAC 832 
Thr Cys Ile Asn Ala Glu Pro Asp Gln Tyr Arg Cys Thr Cys Pro Asp 
155 160 165 

GGC TAC TCG GGC AGG AAC TGT GAG AAG GCT GAG CAC GCC TGC ACC TCC 880 
Gly Tyr Ser Gly Arg Asn Cys Glu Lys Ala Glu His Ala Cys Thr Ser 
170 175 180 

AAC CCG TGT GCC AAC GGG GGC TCT TGC CAT GAG GTG CCG TCC GGC TTC 928 
Asn Pro Cys Ala Asn Gly Gly Ser Cys His Glu Val Pro Ser Gly Phe 
185 190 195 

GAA TGC CAC TGC CCA TCG GGC TGG AGC GGG CCC ACC TGT GCC CTT GAC 976 
Glu Cys His Cys Pro Ser Gly Trp Ser Gly Pro Thr Cys Ala Leu Asp 
200 205 210 215 

ATC GAT GAG TGT GCT TCG AAC CCG TGT GCG GCC GGT GGC ACC TGT GTG 1024 
Ile Asp Glu Cys Ala Ser Asn Pro Cys Ala Ala Gly Gly Thr Cys Val 
220 225 230 

GAC CAG GTG GAC GGC TTT GAG TGC ATC TGC CCC GAG CAG TGG GTG GGG 1072 
Asp Gln Val Asp Gly Phe Glu Cys Ile Cys Pro Glu Gln Trp Val Gly 
235 240 245 

GCC ACC TGC CAG CTG GAC GCC AAT GAG TGT GAA GGG AAG CCA TGC CTT 1120 
Ala Thr Cys Gln Leu Asp Ala Asn Glu Cys Glu Gly Lys Pro Cys Leu 
250 255 260 

AAC GCT TTT TCT TGC AAA AAC CTG ATT GGC GGC TAT TAC TGT GAT TGC 1168 
Asn Ala Phe Ser Cys Lys Asn Leu Ile Gly Gly Tyr Tyr Cys Asp Cys 
265 270 275 

ATC CCG GGC TGG AAG GGC ATC AAC TGC CAT ATC AAC GTC AAC GAC TGT 1216 
Ile Pro Gly Trp Lys Gly Ile Asn Cys His Ile Asn Val Asn Asp Cys 
280 285 290 295 

CGC CGG CAG TCT CAG CAT GGG GGC ACC TGC AAG GAC CTG GTG AAC GGG 1264 
Arg Gly Gln Cys Gln His Gly Gly Thr Cys Lys Asp Leu Val Asn Gly 
300 305 310 

TAC CAG TGT GTG TGC CCA CGG GGC TTC GGA GGC CCG CAT TGC GAG CTG 1312 
Tyr Gln Cys Val Cys Pro Arg Gly Phe Gly Gly Arg His Cys Glu Leu 
315 320 325 

GAA CCA CAC AAG TGT GCC AGC AGC CCC TGC CAC AGC GGC GGC CTC TGC 1360 
Glu Arg Asp Lys Cys Ala Ser Ser Pro Cys His Ser Gly Gly Leu Cys 
330 335 340 

GAG GAC CTG GCC GAC GGC TTC CAC TGC CAC TGC CCC CAG GGC TTC TCC 1408 
Glu Asp Leu Ala Asp Gly Phe His Cys His Cys Pro Gln Gly Phe Ser 
345 350 355 

GGC CCT CTC TGT GAG GTG GAT GTC GAC CTT TGT GAG CCA AGC CCC TGC 1456 
Gly Pro Leu Cys Glu Val Asp Val Asp Leu Cys Glu Pro Ser Pro Cys 
360 365 370 375 

CGG AAC GGC GCT CGC TGC TAT AAC CTG GAG GCT GAC TAT TAC TCC GCC 1504 
Arg Asn Gly Ala Arg Cys Tyr Asn Leu Glu Gly Asp Tyr Tyr Cys Ala 
380 385 390 

TGC CCT CAT GAC TTT GCT GGC AAG AAC TGC TCC GTG CCC CGC GAG CCG 1552 
Cys Pro Asp Asp Phe Gly Gly Lys Asn Cys Ser Val Pro Arg Glu Pro 
395 400 405 

TGC CCT GGC GGG GCC TGC AGA GTG ATC GAT GGC TGC GGG TCA GAC GCG 1600 
Cys Pro Gly Gly Ala Cys Arg Val Ile Asp Gly Cys Gly Ser Asp Ala 
410 415 420 

GGG CCT GGG ATG CCT GCC ACA CCA GCC TCC GGC GTG TGT GGC CCC CAT 1648 
Gly Pro Gly Met Pro Gly Thr Ala Ala Ser Gly Val Cys Gly Pro His 
425 430 435 

GGA CGC TGC GTC AGC CAG CCA GGG GCC AAC TTT TCC TGC ATC TGT GAC 1696 
Gly Arg Cys Val Ser Gln Pro Gly Gly Asn Phe Ser Cys Ile Cys Asp 
440 445 450 455 

AGT GGC TTT ACT GGC ACC TAC TGC CAT GAG AAC ATT GAC GAC TGC CTG 1744 
Ser Gly Phe Thr Gly Thr Tyr Cys His Glu Asn Ile Asp Asp Cys Leu 
460 465 470 

GGC CAC CCC TGC CGC AAT GGG GGC ACA TGC ATC GAT GAG GTG GAC GCC 1792 
Gly Gln Pro Cys Arg Asn Gly Gly Thr Cys Ile Asp Glu Val Asp Ala 
475 480 485 

TTC CGC TGC TTC TGC CCC AGC GGT TGG GAG GGC GAG CTC TGC GAC ACC 1840 
Phe Arg Cys Phe Cys Pro Ser Gly Trp Glu Gly Glu Leu Cys Asp Thr 
490 495 500 

AAT CCC AAC GAC TGC CTT CCC GAT CCC TGC CAC AGC CGC GGC CGC TGC 1888 
Asn Pro Asn Asp Cys Leu Pro Asp Pro Cys His Ser Arg Gly Arg Cys 
505 510 515 

TAC GAC CTG GTC AAT GAC TTC TAC TGT GCG TGC GAC GAC GGC TGG AAG 1936 
Tyr Asp Leu Val Asn Asp Phe Tyr Cys Ala Cys Asp Asp Gly Trp Lys 
520 525 530 535 

GGC AAG ACC TCC CAC TCA CGC GAG TTC CAG TCC GAT GCC TAC ACC TGC 1984 
Gly Lys Thr Cys His Ser Arg Glu Phe Gln Cys Asp Ala Tyr Thr Cys 
540 545 550 

AGC AAC GGT GGC ACC TGC TAC CAC AGC GGC GAC ACC TTC CGC TGC GCC 2032 
Ser Asn Gly Gly Thr Cys Tyr Asp Ser Gly Asp Thr Phe Arg Cys Ala 
555 560 565 

TGC CCC CCC GCC TGG AAG GCC AGC ACC TCC GCC GTC GCC AAC AAC AGC 2080 
Cys Pro Pro Gly Trp Lys Gly Ser Thr Cys Ala Val Ala Lys Asn Ser 
570 575 580 

AGC TCC CTC CCC AAC CCC TGT GTC AAT GGT GCC ACC TGC CTG GGC AGC 2128 
Ser Cys Leu Pro Asn Pro Cys Val Asn Gly Gly Thr Cys Val Gly Ser 
585 590 595 

CGC GCC TCC TTC TCC TCC ATC TCC CCC GAC GCC TGG GAC OCT CCT ACT 2176 
Gly Ala Ser Phe Ser Cys Ile Cys Arg Asp Gly Trp Glu Gly Arg Thr 
600 605 610 615 

TGC ACT CAC AAT ACC AAC GAC TGC AAC CCT CTG CCT TGC TAC AAT GGT 2224 
Cys Thr His Asn Thr Asn Asp Cys Asn Pro Leu Pro Cys Tyr Asn Gly 
620 625 630 

GGC ATC TGT CTT GAC GCC GTC AAC TGC TTC CCC TGC GAG TGT GCA CCT 2272 
Gly Ile Cys Val Asp Gly Val Asn Trp Phe Arg Cys Glu Cys Ala Pro 
635 640 645 

CGC TTC GCG GCC CCT GAC TGC CGC ATC AAC ATC GAC GAG TCC CAG TCC 2320 
Gly Phe Ala Gly Pro Asp Cys Arg Ile Asn Ile Asp Glu Cys Gln Ser 
650 655 660 

TCC CCC TGT GCC TAC GGG GCC ACG TGT GTG GAT GAC ATC AAC GGC TAT 2368 
Ser Pro Cys Ala Tyr Gly Ala Thr Cys Val Asp Glu Ile Asn Gly Tyr 
665 670 675 

CGC TGT AGC TGC CCA CCC GGC CGA GCC GGC CCC CGG TGC CAC GAA CTG 2416 
Arg Cys Ser Cys Pro Pro Gly Arg Ala Gly Pro Arg Cys Gln Glu Val 
680 685 690 695 

ATC GCC TTC GCG AGA TCC TGC TGG TCC CGG GGC ACT CCG TTC CCA CAC 2464 
Ile Gly Phe Gly Arg Ser Cys Trp Ser Arg Gly Thr Pro Phe Pro His 
700 705 710 

GGA AGC TCC TGC GTC GAA GAC TGC AAC AGC TGC CGC TGC CTG GAT GGC 2512 
Gly Ser Ser Trp Val Glu Asp Cys Asn Ser Cys Arg Cys Leu Asp Gly 
715 720 725 

CGC CGT CAC TGC AGC AAG GTC TGG TGC GGA TGG AAG CCT TGT CTG CTG 2560 
Arg Arg Asp Cys Ser Lys Val Trp Cys Gly Trp Lys Pro Cys Leu Leu 
730 735 740 

GCC CCC CAG CCC GAG GCC CTG AGC GCC CAG TGC CCA CTG GGC CAA AGC 2608 
Ala Gly Gln Pro Glu Ala Leu Ser Ala Gln Cys Pro Leu Gly Gln Arg 
745 750 755 

TGC CTG GAG AAG GCC CCA GGC CAG TGT CTG CGA CCA CCC TGT GAG GCC 2656 
Cys Leu Glu Lys Ala Pro Gly Gln Cys Leu Arg Pro Pro Cys Glu Ala 
760 765 770 775 

TGG GGG GAG TGC GCC CCA GAA GAG CCA CCG AGC ACC CCC TGC CTG CCA 2704 
Trp Gly Glu Cys Gly Ala Glu Glu Pro Pro Ser Thr Pro Cys Leu Pro 
780 785 790 

CGC TCC GCC CAC CTG GAC AAT AAC TGT GCC CGC CTC ACC TTG CAT TTC 2752 
Arg Ser Gly His Leu Asp Asn Asn Cys Ala Arg Leu Thr Leu His Phe 
795 800 805 

AAC CGT GAC CAC GTG CCC CAG GGC ACC ACG GTG GGC GCC ATT TGC TCC 2800 
Asn Arg Asp His Val Pro Gln Gly Thr Thr Val Gly Ala Ile Cys Ser 
810 815 820 

GGG ATC CGC TCC CTC CCA GCC ACA AGG GCT GTG GCA CGG GAC CGC CTG 2848 
Gly Ile Arg Ser Leu Pro Ala Thr Arg Ala Val Ala Arg Asp Arg Leu 
825 830 835 

CTG GTG TTG CTT TGC GAC CGG GCG TCC TCG GGG GCC ACT GCT GTG GAG 2896 
Leu Val Leu Leu Cys Asp Arg Ala Ser Ser Gly Ala Ser Ala Val Glu 
840 845 850 855 

GTG GCC GTG TCC TTC AGC CCT GCC AGG GAC CTC CCT GAC AGC AGC CTG 2944 
Val Ala Val Ser Phe Ser Pro Ala Arg Asp Leu Pro Asp Ser Ser Leu 
860 865 870 

ATC CAG GGC GCC GCC CAC GCC ATC CTG GCC CCC ATC ACC CAG CGG CGG 2992 
Ile Gln Gly Ala Ala His Ala Ile Val Ala Ala Ile Thr Gln Arg Gly 
875 880 885 

AAC ACC TCA CTG CTC CTG GCT CTC ACC GAG GTC AAG GTG GAG ACG GTT 3040 
Asn Ser Ser Leu Leu Leu Ala Val Thr Glu Val Lys Val Glu Thr Val 
890 895 900 

GTT ACG GGC GGC TCT TCC ACA GGT CTG CTG GTC CCT GTG CTG TGT GGT 3088 
Val Thr Gly Gly Ser Ser Thr Gly Leu Leu Val Pro Val Leu Cys Gly 
905 910 915 

GCC TTC AGC GTC CTG TGG CTG GCG TGC GTG GTC CTG TGC GTG TGG TGG 3136 
Ala Phe Ser Val Leu Trp Leu Ala Cys Val Val Leu Cys Val Trp Trp 
920 925 930 935 

ACA CGC AAG CGC AGG AAA GAG CGG GAG AGG AGC CGG CTG CCG CGG GAG 3184 
Thr Arg Lys Arg Arg Lys Glu Arg Glu Arg Ser Arg Leu Pro Arg Glu 
940 945 950 

GAG AGC GCC AAC AAC CAG TGG GCC CCG CTC AAC CCC ATC CCC AAC CCC 3232 
Glu Ser Ala Asn Asn Gln Trp Ala Pro Leu Asn Pro Ile Arg Asn Pro 
955 960 965 

ATT GAG CGG CCG GGG GGG CAC AAG GAC GTG CTC TAC CAG TGC AAG AAC 3280 
Ile Glu Arg Pro Gly Gly His Lys Asp Val Leu Tyr Gln Cys Lys Asn 
970 975 980 

TTC ACT CCA CCG CCG CGC AGG CGC TGC CCG GCC CGG CCG GCC ACG CGG 3328 
Phe Thr Pro Pro Pro Arg Arg Arg Cys Pro Gly Arg Pro Ala Thr Arg 
985 990 995 

CCG TCA GGG AGG ATG AGG AGG ACG AGG ATC TTG GCC GCG GTG AGG AGG 3376 
Pro Ser Gly Arg Met Arg Arg Thr Arg Ile Leu Ala Ala Val Arg Arg 
1000 1005 1010 1015 

ACT CCC TGG AGG CGG ACA ACT TCC TCT CAC ACA AAT TCA CCA AAG ATC 3424 
Thr Pro Trp Arg Arg Arg Ser Ser Ser His Thr Asn Ser Pro Lys Ile 
1020 1025 1030 

CTG GCC GCT CGC CGG GGA GGC CGG CCC ACT GGC CCT CAG GCC CCA AAG 3472 
Leu Ala Ala Arg Arg Gly Gly Arg Pro Thr Gly Pro Gln Ala Pro Lys 
1035 1040 1045 

TCC ACA ACC GCG CGC TCA GGA GCA TCA ATG AGG CCC GCT ACG TCG GCA 3520 
Trp Thr Thr Ala Arg Ser Gly Ala Ser Met Arg Pro Ala Thr Ser Ala 
1050 1055 1060 

AGG GAA GTA GGG CGG CTG CAG CTC GGC CGG GAC CCA GGC CCC TCG GTG 3568 
Arg Glu Val Gly Arg Leu Gln Leu Gly Arg Asp Pro Gly Pro Ser Val 
1065 1070 1075 

GCA GCC ATG CCG TCT GCC GGA CCC GGA GGC CGA GGC CAT GTG CAT AGT 3616 
Gly Ala Met Pro Ser Ala Gly Pro Gly Gly Arg Gly His Val His Ser 
1080 1085 1090 1095 

TTC TTT ATT TTG TGT AAA AAA ACC ACC AAA AAC AAA AAC CAA ATG TTT 3664 
Phe Phe Ile Leu Cys Lys Lys Thr Thr Lys Asn Lys Asn Gln Met Phe 
1100 1105 1110 

ATT TTC TAC GTT TCT TTA ACC TTG TAT AAA TTA TTC AGT AAC TGT CAG 3712 
Ile Phe Tyr Val Ser Leu Thr Leu Tyr Lys Leu Phe Ser Asn Cys Gln 
1115 1120 1125 

GCT GAA AAC AAT GGA GTA TTC TCG CAT ACT TGC TAT TTT TGT AAA GTA 3760 
Ala Glu Asn Asn Gly Val Phe Ser Asp Ser Cys Tyr Phe Cys Lys Val 
1130 1135 1140 

GCC GTG CGT GGC ACT CGC TGT ATG AAA GGA GAG AGC AAA GGG TGT CTG 3808 
Ala Val Arg Gly Thr Arg Cys Met Lys Gly Glu Ser Lys Gly Cys Leu 
1145 1150 1155 

CGT CGT CAC CAA ATC GTC GCC TTT GTT ACC AGA GGT TGT GCA CTG TTT 3856 
Arg Arg His Gln Ile Val Ala Phe Val Thr Arg Gly Cys Ala Leu Phe 
1160 1165 1170 1175 

ACA GAA TCT TCC TTT TAT TCC TCA CTC GGG TTT CTC TGT GCT CCA GGC 3904 
Thr Glu Ser Ser Phe Tyr Ser Ser Leu Gly Phe Leu Cys Ala Pro Gly 
1180 1185 1190 

CAA AGT GCC GGT GAG ACC CAT GGC TGT GTT GGT GTG GCC CAT GGC TGT 3952 
Gln Ser Ala Gly Glu Thr His Gly Cys Val Gly Val Ala His Gly Cys 
1195 1200 1205 

TGC TGG GAC CCG TGG CTG ATG GTG TGG CCT GTG GCT GTC GGT GGG ACT 4000 
Trp Trp Asp Pro Trp Leu Met Val Trp Pro Val Ala Val Gly Gly Thr 
1210 1215 1220 

CGT GGC TGT CAA TGG GAC CTG TGG CTG TCG GTG GGA CCT ACG GTG GTC 4048 
Arg Gly Cys Gln Trp Asp Leu Trp Leu Ser Val Gly Pro Thr Val Val 
1225 1230 1235 

GGT GGG ACC CTG GTT ATT GAT GTG GCC CTG GCT GCC GGC ACG GCC CGT 4096 
Gly Gly Thr Leu Val Ile Asp Val Ala Leu Ala Ala Gly Thr Ala Arg 
1240 1245 1250 1255 
GGC TGT TG ACGCACCT GTGGTTGTTA GTGGGGCCTG AGGTCATCGGC GTGGCCCAAG 4154 
Gly Cys 

GCCGGCAGGT CAACCTCGCC CTTGCTGGCC AGTCCACCCT GCCTGCCGTCT GTGCTTCCTC 4214 
CTCCCCAGAA CGCCCGCTCC AGCGATCTCT CCACTGTGCT TTCAGAAGTGC CCTTCCTGCT 4274 
GCGCAGTTCT CCCATCCTGC GACGGCGGCA GTATTGAAGC TCGTGACAAGT GCCTTCACAC 4334 
AGACCCCTCC CAACTGTCCA CGCGTGCCGT GGCACCAGGC GCTGCCCACCT GCCGGCCCCG 4394 
GCCGCCCCTC CTCGTGAAAG TGCATTTTTG TAAATGTGTA CATATTAAAGG AAGCACTCTG 4454 
TATAAAAAAA AAAAACCGGA ATTCC 4483 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1384 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ile Asn Pro Glu Asp Arg Trp Lys Ser Leu His Phe Ser Gly His 
1 5 10 15 

Val Ala His Leu Glu Leu Gln Ile Arg Val Arg Cys Asp Glu Asn Tyr 
20 25 30 

Tyr Ser Ala Thr Cys Asn Lys Phe Cys Arg Pro Arg Asn Asp Phe Phe 
35 40 45 

Gly His Tyr Thr Cys Asp Gln Tyr Gly Asn Lys Ala Cys Met Asp Gly 
50 55 60 

Trp Met Gly Lys Glu Cys Lys Glu Ala Val Cys Lys Gln Gly Cys Asn 
65 70 75 80 

Leu Leu His Gly Gly Cys Thr Val Pro Gly Glu Cys Arg Cys Ser Tyr 
85 90 95 

Gly Trp Gln Gly Arg Phe Cys Asp Glu Cys Val Pro Tyr Pro Gly Cys 
100 105 110 

Val His Gly Ser Cys Val Glu Pro Trp Gln Cys Asn Cys Glu Thr Asn 
115 120 125 

Trp Gly Gly Leu Leu Cys Asp Lys Asp Leu Asn Tyr Cys Gly Ser His 
130 135 140 

His Pro Cys Thr Asn Gly Gly Thr Cys Ile Asn Ala Glu Pro Asp Gln 
145 150 155 160 

Tyr Arg Cys Thr Cys Pro Asp Gly Tyr Ser Gly Arg Asn Cys Glu Lys 
165 170 175 

Ala Glu His Ala Cys Thr Ser Asn Pro Cys Ala Asn Gly Gly Ser Cys 
180 185 190 

His Glu Val Pro Ser Gly Phe Glu Cys His Cys Pro Ser Gly Trp Ser 
195 200 205 

Gly Pro Thr Cys Ala Leu Asp Ile Asp Glu Cys Ala Ser Asn Pro Cys 
210 215 220 

Ala Ala Gly Gly Thr Cys Val Asp Gln Val Asp Gly Phe Glu Cys Ile 
225 230 235 240 

Cys Pro Glu Gln Trp Val Gly Ala Thr Cys Gln Leu Asp Ala Asn Glu 
245 250 255 

Cys Glu Gly Lys Pro Cys Leu Asn Ala Phe Ser Cys Lys Asn Leu Ile 
260 265 270 

Gly Gly Tyr Tyr Cys Asp Cys Ile Pro Gly Trp Lys Gly Ile Asn Cys 
275 280 285 

His Ile Asn Val Asn Asp Cys Arg Gly Gln Cys Gln His Gly Gly Thr 
290 295 300 

Cys Lys Asp Leu Val Asn Gly Tyr Gln Cys Val Cys Pro Arg Gly Phe 
305 310 315 320 

Gly Gly Arg His Cys Glu Leu Glu Arg Asp Lys Cys Ala Ser Ser Pro 
325 330 335 

Cys His Ser Gly Gly Leu Cys Glu Asp Leu Ala Asp Gly Phe His Cys 
340 345 350 

His Cys Pro Gln Gly Phe Ser Gly Pro Leu Cys Glu Val Asp Val Asp 
355 360 365 

Leu Cys Glu Pro Ser Pro Cys Arg Asn Gly Ala Arg Cys Tyr Asn Leu 
370 375 380 

Glu Gly Asp Tyr Tyr Cys Ala Cys Pro Asp Asp Phe Gly Gly Lys Asn 
385 390 395 400 

Cys Ser Val Pro Arg Glu Pro Cys Pro Gly Gly Ala Cys Arg Val Ile 
405 410 415 

Asp Gly Cys Gly Ser Asp Ala Gly Pro Gly Met Pro Gly Thr Ala Ala 
420 425 430 

Ser Gly Val Cys Gly Pro His Gly Arg Cys Val Ser Gln Pro Gly Gly 
435 440 445 

Asn Phe Ser Cys Ile Cys Asp Ser Gly Phe Thr Gly Thr Tyr Cys His 
450 455 460 

Glu Asn Ile Asp Asp Cys Leu Gly Gln Pro Cys Arg Asn Gly Gly Thr 
465 470 475 480 

Cys Ile Asp Glu Val Asp Ala Phe Arg Cys Phe Cys Pro Ser Gly Trp 
485 490 495 

Glu Gly Glu Leu Cys Asp Thr Asn Pro Asn Asp Cys Leu Pro Asp Pro 
500 505 510 

Cys His Ser Arg Gly Arg Cys Tyr Asp Leu Val Asn Asp Phe Tyr Cys 
515 520 525 

Ala Cys Asp Asp Gly Trp Lys Gly Lys Thr Cys His Ser Arg Glu Phe 
530 535 540 

Gln Cys Asp Ala Tyr Thr Cys Ser Asn Gly Gly Thr Cys Tyr Asp Ser 
545 550 555 560 

Gly Asp Thr Phe Arg Cys Ala Cys Pro Pro Gly Trp Lys Gly Ser Thr 
565 570 575 

Cys Ala Val Ala Lys Asn Ser Ser Cys Leu Pro Asn Pro Cys Val Asn 
580 585 590 

Gly Gly Thr Cys Val Gly Ser Gly Ala Ser Phe Ser Cys Ile Cys Arg 
595 600 605 

Asp Gly Trp Glu Gly Arg Thr Cys Thr His Asn Thr Asn Asp Cys Asn 
610 615 620 

Pro Leu Pro Cys Tyr Asn Gly Gly Ile Cys Val Asp Gly Val Asn Trp 
625 630 635 640 

Phe Arg Cys Glu Cys Ala Pro Gly Phe Ala Gly Pro Asp Cys Arg Ile 
645 650 655 

Asn Ile Asp Glu Cys Gln Ser Ser Pro Cys Ala Tyr Gly Ala Thr Cys 
660 665 670 

Val Asp Glu Ile Asn Gly Tyr Arg Cys Ser Cys Pro Pro Gly Arg Ala 
675 680 685 

Gly Pro Arg Cys Gln Glu Val Ile Gly Phe Gly Arg Ser Cys Trp Ser 
690 695 700 

Arg Gly Thr Pro Phe Pro His Gly Ser Ser Trp Val Glu Asp Cys Asn 
705 710 715 720 

Ser Cys Arg Cys Leu Asp Gly Arg Arg Asp Cys Ser Lys Val Trp Cys 
725 730 735 

Gly Trp Lys Pro Cys Leu Leu Ala Gly Gln Pro Glu Ala Leu Ser Ala 
740 745 750 

Gln Cys Pro Leu Gly Gln Arg Cys Leu Glu Lys Ala Pro Gly Gln Cys 
755 760 765 

Leu Arg Pro Pro Cys Glu Ala Trp Gly Glu Cys Gly Ala Glu Glu Pro 
770 775 780 

Pro Ser Thr Pro Cys Leu Pro Arg Ser Gly His Leu Asp Asn Asn Cys 
785 790 795 800 

Ala Arg Leu Thr Leu His Phe Asn Arg Asp His Val Pro Gln Gly Thr 
805 810 815 

Thr Val Gly Ala Ile Cys Ser Gly Ile Arg Ser Leu Pro Ala Thr Arg 
820 825 830 

Ala Val Ala Arg Asp Arg Leu Leu Val Leu Leu Cys Asp Arg Ala Ser 
835 840 845 

Ser Gly Ala Ser Ala Val Glu Val Ala Val Ser Phe Ser Pro Ala Arg 
850 855 860 

Asp Leu Pro Asp Ser Ser Leu Ile Gln Gly Ala Ala His Ala Ile Val 
865 870 875 880 

Ala Ala Ile Thr Gln Arg Gly Asn Ser Ser Leu Leu Leu Ala Val Thr 
885 890 895 

Glu Val Lys Val Glu Thr Val Val Thr Gly Gly Ser Ser Thr Gly Leu 
900 905 910 

Leu Val Pro Val Leu Cys Gly Ala Phe Ser Val Leu Trp Leu Ala Cys 
915 920 925 

Val Val Leu Cys Val Trp Trp Thr Arg Lys Arg Arg Lys Glu Arg Glu 
930 935 940 

Arg Ser Arg Leu Pro Arg Glu Glu Ser Ala Asn Asn Gln Trp Ala Pro 
945 950 955 960 

Leu Asn Pro Ile Arg Asn Pro Ile Glu Arg Pro Gly Gly His Lys Asp 
965 970 975 

Val Leu Tyr Gln Cys Lys Asn Phe Thr Pro Pro Pro Arg Arg Arg Cys 
980 985 990 

Pro Gly Arg Pro Ala Thr Arg Pro Ser Gly Arg Met Arg Arg Thr Arg 
995 1000 1005 

Ile Leu Ala Ala Val Arg Arg Thr Pro Trp Arg Arg Arg Ser Ser Ser 
1010 1015 1020 

His Thr Asn Ser Pro Lys Ile Leu Ala Ala Arg Arg Gly Gly Arg Pro 
1025 1030 1035 1040 

Thr Gly Pro Gln Ala Pro Lys Trp Thr Thr Ala Arg Ser Gly Ala Ser 
1045 1050 1055 

Met Arg Pro Ala Thr Ser Ala Arg Glu Val Gly Arg Leu Gln Leu Gly 
1060 1065 1070 

Arg Asp Pro Gly Pro Ser Val Gly Ala Met Pro Ser Ala Gly Pro Gly 
1075 1080 1085 

Gly Arg Gly His Val His Ser Phe Phe Ile Leu Cys Lys Lys Thr Thr 
1090 1095 1100 

Lys Asn Lys Asn Gln Met Phe Ile Phe Tyr Val Ser Leu Thr Leu Tyr 
1105 1110 1115 1120 

Lys Leu Phe Ser Asn Cys Gln Ala Glu Asn Asn Gly Val Phe Ser Asp 
1125 1130 1135 

Ser Cys Tyr Phe Cys Lys Val Ala Val Arg Gly Thr Arg Cys Met Lys 
1140 1145 1150 

Gly Glu Ser Lys Gly Cys Leu Arg Arg His Gln Ile Val Ala Phe Val 
1155 1160 1165 

Thr Arg Gly Cys Ala Leu Phe Thr Glu Ser Ser Phe Tyr Ser Ser Leu 
1170 1175 1180 

Gly Phe Leu Cys Ala Pro Gly Gln Ser Ala Gly Glu Thr His Gly Cys 
1185 1190 1195 1200 

Val Gly Val Ala His Gly Cys Trp Trp Asp Pro Trp Leu Met Val Trp 
1205 1210 1215 

Pro Val Ala Val Gly Gly Thr Arg Gly Cys Gln Trp Asp Leu Trp Leu 
1220 1225 1230 

Ser Val Gly Pro Thr Val Val Gly Gly Thr Leu Val Ile Asp Val Ala 
1235 1240 1245 

Leu Ala Ala Gly Thr Ala Arg Gly Cys 
1250 1255 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3582 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..3582 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CAC GTG GCG TCA GCA TCG GGA CAG TTC GAG CTG GAG ATC TTA TCC GTG 48 
Gln Val Ala Ser Ala Ser Gly Gln Phe Glu Leu Glu Ile Leu Ser Val 
1 5 10 15 

CAC AAT GTG AAC GGC GTG CTG CAG AAC GGG AAC TGC TGC GAC GGC ACT 96 
Gln Asn Val Asn Gly Val Leu Gln Asn Gly Asn Cys Cys Asp Gly Thr 
20 25 30 

CGA AAC CCC GGA GAT AAA AAG TGC ACC AGA GAT GAG TGT GAC ACC TAC 144 
Arg Asn Pro Gly Asp Lys Lys Cys Thr Arg Asp Glu Cys Asp Thr Tyr 
35 40 45 

TTT AAA CTT TGC CTG AAG GA TAC CAG TCG CGG GTC ACT GCT GGC GGC 192 
Phe Lys Val Cys Leu Lys Glu Tyr Gln Ser Arg Val Thr Ala Gly Gly 
50 55 60 

CCT TGC AGC TTC GGA TCC AAA TCC ACC CCT GTC ATC CGC GGG AAT ACC 240 
Pro Cys Ser Phe Gly Ser Lys Ser Thr Pro Val Ile Gly Gly Asn Thr 
65 70 75 80 

TTC AAT TTA AAG TAC AGC CGG AAT AAT GAA AAG AAC CGG ATT GTT ATC 288 
Phe Asn Leu Lys Tyr Ser Arg Asn Asn Glu Lys Asn Arg Ile Val Ile 
85 90 95 

CCT TTC ACC TTC GCC TCG CCG AGA TCC TAC ACG TTG CTT GTT GAG CCA 336 
Pro Phe Thr Phe Ala Trp Pro Arg Ser Tyr Thr Leu Leu Val Glu Ala 
100 105 110 

TGC GAT TAC AAT GAT AAC TCT ACT AAT CCC GAT CGC ATA ATT GAG AAG 384 
Trp Asp Tyr Asn Asp Asn Ser Thr Asn Pro Asp Arg Ile Ile Glu Lys 
115 120 125 

GCA TCC CAC TCT GGC ATG ATC AAT CCA AGC CGT CAG TGG CAG ACG TTG 432 
Ala Ser His Ser Gly Met Ile Asn Pro Ser Arg Gln Trp Gln Thr Leu 
130 135 140 

AAA CAT AAC ACA GGA GCT GCC CAC TTT GAG TAT CAA ATC CGT GTG ACT 480 
Lys His Asn Thr Gly Ala Ala His Phe Glu Tyr Gln Ile Arg Val Thr 
145 150 155 160 

TGC GCA GAA CAT TAC TAT GGC TTT GGA TGC AAC AAG TTT TGT CGA CCG 528 
Cys Ala Glu His Tyr Tyr Gly Phe Gly Cys Asn Lys Phe Cys Arg Pro 
165 170 175 

AGA GAT GAC TTC TTC ACT CAC CAT ACC TGT GAC CAG AAT GGC AAC AAA 576 
Arg Asp Asp Phe Phe Thr His His Thr Cys Asp Gln Asn Gly Asn Lys 
180 185 190 

ACC TGC TTG CAA GGC TGG ACG GGA CCA GAA TGC AAC AAA GCT ATT TGT 624 
Thr Cys Leu Glu Gly Trp Thr Gly Pro Glu Cys Asn Lys Ala Ile Cys 
195 200 205 

CGT CAG GGA TGT AGC CCC AAG CAT GGT TCT TGC ACA GTT CCA GGA GAG 672 
Arg Gln Gly Cys Ser Pro Lys His Gly Ser Cys Thr Val Pro Gly Glu 
210 215 220 

TGC ACG TCT CAG TAT GGA TGG CAA GGC CAG TAC TGT GAT AAG TGC ATT 720 
Cys Arg Cys Gln Tyr Gly Trp Gln Gly Gln Tyr Cys Asp Lys Cys Ile 
225 230 235 240 

CCA CAC CCG GGA TGT GTC CAT GGC ACT TGC ATT GAA CCA TGC CAC TGC 768 
Pro His Pro Gly Cys Val His Gly Thr Cys Ile Glu Pro Trp Gln Cys 
245 250 255 

CTC TGT GAA ACC AAC TGG GGT GGT CAG CTC TGT GAC AAA GAC CTG AAC 816 
Leu Cys Glu Thr Asn Trp Gly Gly Gln Leu Cys Asp Lys Asp Leu Asn 
260 265 270 

TAC TGT GGA ACC CAC CCA CCC TGT TTG AAT GGT GGT ACC TGC AGC AAC 864 
Tyr Cys Gly Thr His Pro Pro Cys Leu Asn Gly Gly Thr Cys Ser Asn 
275 280 285 

ACT GGC CCC GAT AAA TAC CAC TGT TCC TGC CCT GAG GGT TAC TCA GGA 912 
Thr Gly Pro Asp Lys Tyr Gln Cys Ser Cys Pro Glu Gly Tyr Ser Gly 
290 295 300 

CAC AAC TGT GAA ATA GCG GAG CAT GCG TGC CTC TCT GAT CCG TGC CAC 960 
Gln Asn Cys Glu Ile Ala Glu His Ala Cys Leu Ser Asp Pro Cys His 
305 310 315 320 

AAC GGA GGA AGC TGC CTA GAA ACG TCT ACA GGA TTT GAA TGT GTG TGT 1008 
Asn Gly Gly Ser Cys Leu Glu Thr Ser Thr Gly Phe Glu Cys Val Cys 
325 330 335 

GCA CCT GGC TGG GCT GGA CCA ACT TGC ACT GAT AAT ATT GAT GAT TGT 1056 
Ala Pro Gly Trp Ala Gly Pro Thr Cys Thr Asp Asn Ile Asp Asp Cys 
340 345 350 

TCT CCA AAT CCC TGT GGT CAT GGA GGA ACT TGC CAA GAT CTA GTT GAT 1104 
Ser Pro Asn Pro Cys Gly His Gly Gly Thr Cys Gln Asp Leu Val Asp 
355 360 365 

GGA TTT AAG TCT ATT TGC CCA CCT CAG TGG ACT GGC AAA ACA TGC CAG 1152 
Gly Phe Lys Cys Ile Cys Pro Pro Gln Trp Thr Gly Lys Thr Cys Gln 
370 375 380 

CTA GAT GCG AAT GAA TGT GAG GGC AAA CCC TGT GTC AAT GCC AAC TCC 1200 
Leu Asp Ala Asn Glu Cys Glu Gly Lys Pro Cys Val Asn Ala Asn Ser 
385 390 395 400 

TGC ACG AAC TTG ATT GGC AGC TAC TAT TGT GAC TGC ATT ACT GGC TGG 1248 
Cys Arg Asn Leu Ile Gly Ser Tyr Tyr Cys Asp Cys Ile Thr Gly Trp 
405 410 415 

TCT GGC CAC AAC TGT GAT ATA AAT ATT AAT GAT TGT CGT GGA CAA TGT 1296 
Ser Gly His Asn Cys Asp Ile Asn Ile Asn Asp Cys Arg Gly Gln Cys 
420 425 430 

CAG AAT GGA GGA TCC TCT CGG GAC TTG GTT AAT GGT TAT CGG TGC ATC 1344 
Gln Asn Gly Gly Ser Cys Arg Asp Leu Val Asn Gly Tyr Arg Cys Ile 
435 440 445 

TGT TCA CCT GGC TAT GCA GGA GAT CAC TGT GAG AAA GAC ATC AAT GAA 1392 
Cys Ser Pro Gly Tyr Ala Gly Asp His Cys Glu Lys Asp Ile Asn Glu 
450 455 460 

TGT GCA AGT AAC CCT TGC ATG AAT GGG GGT CAC TGC CAG GAT GAA ATC 1440 
Cys Ala Ser Asn Pro Cys Met Asn Gly Gly His Cys Gln Asp Glu Ile 
465 470 475 480 

AAT GGA TTC CAA TGT CTG TGT CCT GCT GGT TTC TCA GGA AAC CTC TGT 1488 
Asn Gly Phe Gln Cys Leu Cys Pro Ala Gly Phe Ser Gly Asn Leu Cys 
485 490 495 

CAG CTC GAT ATA GAC TAC TGT GAG CCA AAC CCT TGC CAG AAC GGT GCC 1536 
Gln Leu Asp Ile Asp Tyr Cys Glu Pro Asn Pro Cys Gln Asn Gly Ala 
500 505 510 

CAG TGC TTC AAT CTT GCT ATG GAC TAT TTC TGT AAC TGC CCT GAA GAT 1584 
Gln Cys Phe Asn Leu Ala Met Asp Tyr Phe Cys Asn Cys Pro Glu Asp 
515 520 525 

TAC GAA GCC AAG AAC TGC TCC CAC CTC AAA GAT CAC TGC CGC ACA ACT 1632 
Tyr Glu Gly Lys Asn Cys Ser His Leu Lys Asp His Cys Arg Thr Thr 
530 535 540 

CCT TCT CAA GTA ATC GAC AGC TGT ACA GTG GCA GTG GCT TCT AAC AGC 1680 
Pro Cys Glu Val Ile Asp Ser Cys Thr Val Ala Val Ala Ser Asn Ser 
545 550 555 560 

ACA CCA GAA GGA GTT CGT TAC ATT TCT TCA AAT GTC TGT GGT CCT CAT 1728 
Thr Pro Glu Gly Val Arg Tyr Ile Ser Ser Asn Val Cys Gly Pro His 
565 570 575 

GGA AAA TGC AAG AGC CAA GCA GGT GGA AAA TTC ACC TGT GAA TGC AAC 1776 
Gly Lys Cys Lys Ser Gln Ala Gly Gly Lys Phe Thr Cys Glu Cys Asn 
580 585 590 

AAA GGA TTC ACT GGC ACC TAC TGT CAT GAG AAT ATC AAT GAC TGT GAG 1824 
Lys Gly Phe Thr Gly Thr Tyr Cys His Glu Asn Ile Asn Asp Cys Glu 
595 600 605 

AGC AAC CCC TGT AAA AAT GGT GGC ACT TGT ATT GAC GGT GTA AAC TCC 1872 
Ser Asn Pro Cys Lys Asn Gly Gly Thr Cys Ile Asp Gly Val Asn Ser 
610 615 620 

TAC AAA TGT ATT TGT AGT GAT GGA TGG GAA GGA ACA TAT TCT GAA ACA 1920 
Tyr Lys Cys Ile Cys Ser Asp Gly Trp Glu Gly Thr Tyr Cys Glu Thr 
625 630 635 640 

AAT ATT AAT GAC TGC AGT AAA AAC CCC TGC CAC AAT GGA GGA ACT TGC 1968 
Asn Ile Asn Asp Cys Ser Lys Asn Pro Cys His Asn Gly Gly Thr Cys 
645 650 655 

CGA GAC TTG GTC AAT GAC TTC TTC TGT GAA TGT AAA AAT GGG TGG AAA 2016 
Arg Asp Leu Val Asn Asp Phe Phe Cys Glu Cys Lys Asn Gly Trp Lys 
660 665 670 

GGA AAA ACT TGC CAC TCT CGT GAC AGC CAG TGT GAT GAG GCA ACA TGC 2064 
Gly Lys Thr Cys His Ser Arg Asp Ser Gln Cys Asp Glu Ala Thr Cys 
675 680 685 

AAT AAT GGA GGA ACA TGT TAT GAT GAG GGG GAC ACT TTC AAG TGC ATG 2112 
Asn Asn Gly Gly Thr Cys Tyr Asp Glu Gly Asp Thr Phe Lys Cys Met 
690 695 700 

TGT CCT GCA GGA TGG GAA GGA GCC ACT TGT AAT ATA GCA AGG AAC AGC 2160 
Cys Pro Ala Gly Trp Glu Gly Ala Thr Cys Asn Ile Ala Arg Asn Ser 
705 710 715 720 

AGC TGC CTG CCA AAC CCC TGT CAC AAT GGT GGT ACC TGT GTA GTT AGT 2208 
Ser Cys Leu Pro Asn Pro Cys His Asn Gly Gly Thr Cys Val Val Ser 
725 730 735 

GGG CAT TCT TTC ACT TGT GTC TGC AAG GAG GGC TGG GAA GGA CCG ACA 2256 
Gly Asp Ser Phe Thr Cys Val Cys Lys Glu Gly Trp Glu Gly Pro Thr 
740 745 750 

TGT ACT CAG AAC ACA AAT GAC TGC AGT CCT CAT CCT TGT TAC AAC AGT 2304 
Cys Thr Gln Asn Thr Asn Asp Cys Ser Pro His Pro Cys Tyr Asn Ser 
755 760 765 

GGT ACT TGT GTG GAT GGA GAC AAC TGG TAC CGC TGT GAG TGC GCT CCC 2352 
Gly Thr Cys Val Asp Gly Asp Asn Trp Tyr Arg Cys Glu Cys Ala Pro 
770 775 780 

GGC TTC GCA GGT CCC GAC TGT AGG ATC AAC ATC AAT GAA TGT CAG TCT 2400 
Gly Phe Ala Gly Pro Asp Cys Arg Ile Asn Ile Asn Glu Cys Gln Ser 
785 790 795 800 

TCA CCC TGT GCC TTT GGG GCT ACT TGT GTG GAT GAA ATT AAT GGG TAC 2448 
Ser Pro Cys Ala Phe Gly Ala Thr Cys Val Asp Glu Ile Asn Gly Tyr 
805 810 815 

CGT TGC ATT TGT CCA CCG GGT CGC AGT GGT CCA GGA TGC CAG GAA GTT 2496 
Arg Cys Ile Cys Pro Pro Gly Arg Ser Gly Pro Gly Cys Gln Glu Val 
820 825 830 

ACA GGG AGG CCT TGC TTT ACC AGT ATT CGA GTA ATG CCA GAC GGT GCT 2544 
Thr Gly Arg Pro Cys Phe Thr Ser Ile Arg Val Met Pro Asp Gly Ala 
835 840 845 

AAG TGG GAT GAT GAC TGT AAT ACT TGT CAG TGT TTG AAT GGA AAA GTC 2592 
Lys Trp Asp Asp Asp Cys Asn Thr Cys Gln Cys Leu Asn Gly Lys Val 
850 855 860 

ACC TGT TCT AAG GTT TGG TGT GGT CCT CGA CCT TGT ATA ATA CAT GCC 2640 
Thr Cys Ser Lys Val Trp Cys Gly Pro Arg Pro Cys Ile Ile His Ala 
865 870 875 880 

AAA GGT CAT AAT GAA TGC CCA CCT GGA CAC CCT TGT GTT CCT GTT AAA 2688 
Lys Gly His Asn Glu Cys Pro Ala Gly His Ala Cys Val Pro Val Lys 
885 890 895 

GAA CAC CAT TCT TTC ACT CAT CCT TGT GCT GCA GTG GGT GAA TGC TGC 2736 
Glu Asp His Cys Phe Thr His Pro Cys Ala Ala Val Gly Glu Cys ™ 
900 905 910 

CCT TCT AAT CAG CAG CCT GTG AAC ACC AAA TGC AAT TCT CAT TCT TAT 2784 
Pro Ser Asn Gln Gln Pro Val Lys Thr Lys Cys Asn Ser Asp Ser Tyr 
915 920 925 

TAC CAA CAT AAT TGT GCC AAC ATC ACC TTC ACC TTT AAT AAG GAA ATC 2832 
Tyr Gln Asp Asn Cys Ala Asn Ile Thr Phe Thr Phe Asn Lys Glu Met 
930 935 940 

£I£ SfJ o CA f TT ACC ACG GAG CAC ATT TGC GAA TTG AGG AAT 2880 
Met Ala Pro Gly Leu Thr Thr Glu His Ile Cys Ser Glu Leu Arg Asn 
945 950 955 960 

CTG AAT ATC CTG AAG AAT GTT TCT GCT GAA TAT TCC ATC TAT ATT ACC 2928 
Leu Asn Ile Leu Lys Asn Val Ser Ala Glu Tyr Ser Ile Tyr Ile Thr 
965 970 975 

Vtl 5??° S CT l CA CAC TTC GCA ^ AAT °AA ATA CAT GTT GCT ATT TCT 2976 
Cys Glu Pro Ser His Leu Ala Asn Asn Glu Ile His Val Ala Ile Ser 
980 985 990 

llZ G^ 1*1 ?T A ^ GAT GAA AAC CCA ATC AAG GAA ATC ACA GAT 3024 
Ala Glu Asp Ile Gly Glu Asp Glu Asn Pro Ile Lys Glu Ile Thr Asp 
995 1000 1005 

£s He lH A«n tT S T ? AGT AAG CGT GAT GGA AAC AAC ACA CTA ATT 3072 
llto P Leu Val Ser Lys Arg Asp Gly Asn Asn Thr Leu Ile 
1010 1015 1020 

GCT GCA GTC GCA GAA GTC AGA GTA CAA AGG CCA CCA GTT AAG AAC AAA 3120 
Ala Ala Val Ala Glu Val Arg Val Gln Arg Arg Pro Val Lys Asn Lys 
1030 1035 1040 

Thr A A o £f 7™ S T ? » CA TTA CTG AGC TCA GTC TTA ACA GTA GCC TGC 3168 
Thr Asp Phe Leu Val Pro Leu Leu Ser Ser Val Leu Thr Val Ala Trp 
1045 1050 1055 

nf J CT f 70 C / A ACT GTT TTC TAT TGC TGC ATT CAA AAG CGC AGA 3216 
Ile Cys Cys Leu Val Thr Val Phe Tyr Trp Cys Ile Gln Lys Arg Arg 
1060 1065 1070 

AAG CAG AGC AGC CAT ACT CAC ACA GCA TCT CAT GAC AAC ACC ACC AAC 3264 
Lys Gln Ser Ser His Thr His Thr Ala Ser Asp Asp Asn Thr Thr Asn 
1075 1080 1085 

A^n Sit AGG ^ G CAC 0X0 AAT CAG ATT *** AAC CCC ATA GAG AAA CAC 3312 
YSon 9 Glu Gln Leu Asn Gln Ile Lys Asn Pro Ile Glu Lys His 
1090 1095 1100 

AAT A 5 T , GT T CCA ATT *** GAC TAT GAA AAC AAA AAC TCT AAA 3360 
Gly Ala Asn Thr Val Pro Ile Lys Asp Tyr Glu Asn Lys Asn Ser Lys 
1105 1110 1115 1120 

ATC GCC AAA ATA AGO ACG CAC AAT TCA GAA GTG GAG GAA GAT GAC ATG 3408 
Ile Ala Lys Ile Arg Thr His Asn Ser Glu Val Glu Glu Asp Asp Met 
1125 1130 1135 

GAC AAA CAC CAG CAA AAG GCC CGC TTT GCC AAG CAG CCA GCG TAC ACT 3456 
Asp Lys His Gln Gln Lys Ala Arg Phe Ala Lys Gln Pro Ala Tyr Thr 
1140 1145 1150 

TTG GTA GAC AGA GAT GAA AAG CCA CCC AAC AGC ACA CCC ACA AAA CAC 3504 
Leu Val Asp Arg Asp Glu Lys Pro Pro Asn Ser Thr Pro Thr Lys His 
1155 1160 1165 

CCA AAC TGG ACA AAT AAA CAG GAC AAC AGA GAC TTG GAA AGT GCA CAA 3552 
Pro Asn Trp Thr Asn Lys Gln Asp Asn Arg Asp Leu Glu Ser Ala Gln 
1170 1175 1180 

AGT TTA AAT AGA ATG GAG TAC ATT GTA 
Ser Leu Asn Arg Met Glu Tyr Ile Val 
1185 1190 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1194 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Gln Val Ala Ser Ala Ser Gly Gln Phe Glu Leu Glu Ile Leu Ser Val
1 5 10 15

Gln Asn Val Asn Gly Val Leu Gln Asn Gly Asn Cys Cys Asp Gly Thr
20 25 30

Arg Asn Pro Gly Asp Lys Lys Cys Thr Arg Asp Glu Cys Asp Thr Tyr
35 40 45

Phe Lys Val Cys Leu Lys Glu Tyr Gln Ser Arg Val Thr Ala Gly Gly
50 55 60

Pro Cys Ser Phe Gly Ser Lys Ser Thr Pro Val Ile Gly Gly Asn Thr
65 70 75 80

Phe Asn Leu Lys Tyr Ser Arg Asn Asn Glu Lys Asn Arg Ile Val Ile
85 90 95

Pro Phe Thr Phe Ala Trp Pro Arg Ser Tyr Thr Leu Leu Val Glu Ala
100 105 110

Trp Asp Tyr Asn Asp Asn Ser Thr Asn Pro Asp Arg Ile Ile Glu Lys
115 120 125

Ala Ser His Ser Gly Met Ile Asn Pro Ser Arg Gln Trp Gln Thr Leu
130 135 140

Lys His Asn Thr Gly Ala Ala His Phe Glu Tyr Gln Ile Arg Val Thr
145 150 155 160

Cys Ala Glu His Tyr Tyr Gly Phe Gly Cys Asn Lys Phe Cys Arg Pro
165 170 175

Arg Asp Asp Phe Phe Thr His His Thr Cys Asp Gln Asn Gly Asn Lys
180 185 190

Thr Cys Leu Glu Gly Trp Thr Gly Pro Glu Cys Asn Lys Ala Ile Cys
195 200 205

Arg Gln Gly Cys Ser Pro Lys His Gly Ser Cys Thr Val Pro Gly Glu
210 215 220

Cys Arg Cys Gln Tyr Gly Trp Gln Gly Gln Tyr Cys Asp Lys Cys Ile
225 230 235 240

Pro His Pro Gly Cys Val His Gly Thr Cys Ile Glu Pro Trp Gln Cys
245 250 255

Leu Cys Glu Thr Asn Trp Gly Gly Gln Leu Cys Asp Lys Asp Leu Asn
260 265 270

Tyr Cys Gly Thr His Pro Pro Cys Leu Asn Gly Gly Thr Cys Ser Asn
275 280 285

Thr Gly Pro Asp Lys Tyr Gln Cys Ser Cys Pro Glu Gly Tyr Ser Gly
290 295 300

Gln Asn Cys Glu Ile Ala Glu His Ala Cys Leu Ser Asp Pro Cys His
305 310 315 320

Asn Gly Gly Ser Cys Leu Glu Thr Ser Thr Gly Phe Glu Cys Val Cys
325 330 335

Ala Pro Gly Trp Ala Gly Pro Thr Cys Thr Asp Asn Ile Asp Asp Cys
340 345 350

Ser Pro Asn Pro Cys Gly His Gly Gly Thr Cys Gln Asp Leu Val Asp
355 360 365

Gly Phe Lys Cys Ile Cys Pro Pro Gln Trp Thr Gly Lys Thr Cys Gln
370 375 380

Leu Asp Ala Asn Glu Cys Glu Gly Lys Pro Cys Val Asn Ala Asn Ser
385 390 395 400

Cys Arg Asn Leu Ile Gly Ser Tyr Tyr Cys Asp Cys Ile Thr Gly Trp
405 410 415

Ser Gly His Asn Cys Asp Ile Asn Ile Asn Asp Cys Arg Gly Gln Cys
420 425 430

Gln Asn Gly Gly Ser Cys Arg Asp Leu Val Asn Gly Tyr Arg Cys Ile
435 440 445

Cys Ser Pro Gly Tyr Ala Gly Asp His Cys Glu Lys Asp Ile Asn Glu
450 455 460

Cys Ala Ser Asn Pro Cys Met Asn Gly Gly His Cys Gln Asp Glu Ile
465 470 475 480

Asn Gly Phe Gln Cys Leu Cys Pro Ala Gly Phe Ser Gly Asn Leu Cys
485 490 495

Gln Leu Asp Ile Asp Tyr Cys Glu Pro Asn Pro Cys Gln Asn Gly Ala
500 505 510

Gln Cys Phe Asn Leu Ala Met Asp Tyr Phe Cys Asn Cys Pro Glu Asp
515 520 525

Tyr Glu Gly Lys Asn Cys Ser His Leu Lys Asp His Cys Arg Thr Thr
530 535 540

Pro Cys Glu Val Ile Asp Ser Cys Thr Val Ala Val Ala Ser Asn Ser
545 550 555 560

Thr Pro Glu Gly Val Arg Tyr Ile Ser Ser Asn Val Cys Gly Pro His
565 570 575

Gly Lys Cys Lys Ser Gln Ala Gly Gly Lys Phe Thr Cys Glu Cys Asn
580 585 590

Lys Gly Phe Thr Gly Thr Tyr Cys His Glu Asn Ile Asn Asp Cys Glu
595 600 605

Ser Asn Pro Cys Lys Asn Gly Gly Thr Cys Ile Asp Gly Val Asn Ser
610 615 620

Tyr Lys Cys Ile Cys Ser Asp Gly Trp Glu Gly Thr Tyr Cys Glu Thr
625 630 635 640

Asn Ile Asn Asp Cys Ser Lys Asn Pro Cys His Asn Gly Gly Thr Cys
645 650 655

Arg Asp Leu Val Asn Asp Phe Phe Cys Glu Cys Lys Asn Gly Trp Lys
660 665 670

Gly Lys Thr Cys His Ser Arg Asp Ser Gln Cys Asp Glu Ala Thr Cys
675 680 685

Asn Asn Gly Gly Thr Cys Tyr Asp Glu Gly Asp Thr Phe Lys Cys Met
690 695 700

Cys Pro Ala Gly Trp Glu Gly Ala Thr Cys Asn Ile Ala Arg Asn Ser
705 710 715 720

Ser Cys Leu Pro Asn Pro Cys His Asn Gly Gly Thr Cys Val Val Ser
725 730 735

Gly Asp Ser Phe Thr Cys Val Cys Lys Glu Gly Trp Glu Gly Pro Thr
740 745 750

Cys Thr Gln Asn Thr Asn Asp Cys Ser Pro His Pro Cys Tyr Asn Ser
755 760 765

Gly Thr Cys Val Asp Gly Asp Asn Trp Tyr Arg Cys Glu Cys Ala Pro
770 775 780

Gly Phe Ala Gly Pro Asp Cys Arg Ile Asn Ile Asn Glu Cys Gln Ser
785 790 795 800

Ser Pro Cys Ala Phe Gly Ala Thr Cys Val Asp Glu Ile Asn Gly Tyr
805 810 815

Arg Cys Ile Cys Pro Pro Gly Arg Ser Gly Pro Gly Cys Gln Glu Val
820 825 830

Thr Gly Arg Pro Cys Phe Thr Ser Ile Arg Val Met Pro Asp Gly Ala
835 840 845

Lys Trp Asp Asp Asp Cys Asn Thr Cys Gln Cys Leu Asn Gly Lys Val
850 855 860

Thr Cys Ser Lys Val Trp Cys Gly Pro Arg Pro Cys Ile Ile His Ala
865 870 875 880

Lys Gly His Asn Glu Cys Pro Ala Gly His Ala Cys Val Pro Val Lys
885 890 895

Glu Asp His Cys Phe Thr His Pro Cys Ala Ala Val Gly Glu Cys Trp
900 905 910

Pro Ser Asn Gln Gln Pro Val Lys Thr Lys Cys Asn Ser Asp Ser Tyr
915 920 925

Tyr Gln Asp Asn Cys Ala Asn Ile Thr Phe Thr Phe Asn Lys Glu Met
930 935 940

Met Ala Pro Gly Leu Thr Thr Glu His Ile Cys Ser Glu Leu Arg Asn
945 950 955 960

Leu Asn Ile Leu Lys Asn Val Ser Ala Glu Tyr Ser Ile Tyr Ile Thr
965 970 975

Cys Glu Pro Ser His Leu Ala Asn Asn Glu Ile His Val Ala Ile Ser
980 985 990

Ala Glu Asp Ile Gly Glu Asp Glu Asn Pro Ile Lys Glu Ile Thr Asp
995 1000 1005

Lys Ile Ile Asp Leu Val Ser Lys Arg Asp Gly Asn Asn Thr Leu Ile
1010 1015 1020

Ala Ala Val Ala Glu Val Arg Val Gln Arg Arg Pro Val Lys Asn Lys
1025 1030 1035 1040

Thr Asp Phe Leu Val Pro Leu Leu Ser Ser Val Leu Thr Val Ala Trp
1045 1050 1055

Ile Cys Cys Leu Val Thr Val Phe Tyr Trp Cys Ile Gln Lys Arg Arg
1060 1065 1070

Lys Gln Ser Ser His Thr His Thr Ala Ser Asp Asp Asn Thr Thr Asn
1075 1080 1085

Asn Val Arg Glu Gln Leu Asn Gln Ile Lys Asn Pro Ile Glu Lys His
1090 1095 1100

Gly Ala Asn Thr Val Pro Ile Lys Asp Tyr Glu Asn Lys Asn Ser Lys
1105 1110 1115 1120

Ile Ala Lys Ile Arg Thr His Asn Ser Glu Val Glu Glu Asp Asp Met
1125 1130 1135

Asp Lys His Gln Gln Lys Ala Arg Phe Ala Lys Gln Pro Ala Tyr Thr
1140 1145 1150

Leu Val Asp Arg Asp Glu Lys Pro Pro Asn Ser Thr Pro Thr Lys His
1155 1160 1165

Pro Asn Trp Thr Asn Lys Gln Asp Asn Arg Asp Leu Glu Ser Ala Gln
1170 1175 1180

Ser Leu Asn Arg Met Glu Tyr Ile Val
1185 1190

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 236 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met His Trp Ile Lys Cys Leu Leu Thr Ala Phe Ile Cys Phe Thr Val
1 5 10 15

Ile Val Gln Val His Ser Ser Gly Ser Phe Glu Leu Arg Leu Lys Tyr
20 25 30

Phe Ser Asn Asp His Gly Arg Asp Asn Glu Gly Arg Cys Cys Ser Gly
35 40 45

Glu Ser Asp Gly Ala Thr Gly Lys Cys Leu Gly Ser Cys Lys Thr Arg
50 55 60

Phe Arg Val Cys Leu Lys His Tyr Gln Ala Thr Ile Asp Thr Thr Ser
65 70 75 80

Gln Cys Thr Tyr Gly Asp Val Ile Thr Pro Ile Leu Gly Glu Asn Ser
85 90 95

Val Asn Leu Thr Asp Ala Gln Arg Phe Gln Asn Lys Gly Phe Thr Asn
100 105 110

Pro Ile Gln Phe Pro Phe Ser Phe Ser Trp Pro Gly Thr Phe Ser Leu
115 120 125

Ile Val Glu Ala Trp His Asp Thr Asn Asn Ser Gly Asn Ala Arg Thr
130 135 140

Asn Lys Leu Leu Ile Gln Arg Leu Leu Val Gln Gln Val Leu Glu Val
145 150 155 160

Ser Ser Glu Trp Lys Thr Asn Lys Ser Glu Ser Gln Tyr Thr Ser Leu
165 170 175

Glu Tyr Asp Phe Arg Val Thr Cys Asp Leu Asn Tyr Tyr Gly Ser Gly
180 185 190

Cys Ala Lys Phe Cys Arg Pro Arg Asp Asp Ser Phe Gly His Ser Thr
195 200 205

Cys Ser Glu Thr Gly Glu Ile Ile Cys Leu Thr Gly Trp Gln Gly Asp
210 215 220

Tyr Cys His Ile Pro Lys Cys Ala Lys Gly Cys Glu
225 230 235

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1405 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Met Phe Arg Lys His Phe Arg Arg Lys Pro Ala Thr Ser Ser Ser Leu
1 5 10 15

Glu Ser Thr Ile Glu Ser Ala Asp Ser Leu Gly Met Ser Lys Lys Thr
20 25 30

Ala Thr Lys Arg Gln Arg Pro Arg His Arg Val Pro Lys Ile Ala Thr
35 40 45

Leu Pro Ser Thr Ile Arg Asp Cys Arg Ser Leu Lys Ser Ala Cys Asn
50 55 60

Leu Ile Ala Leu Ile Leu Ile Leu Leu Val His Lys Ile Ser Ala Ala
65 70 75 80

Gly Asn Phe Glu Leu Glu Ile Leu Glu Ile Ser Asn Thr Asn Ser His
85 90 95

Leu Leu Asn Gly Tyr Cys Cys Gly Met Pro Ala Glu [illegible] Arg Ala Thr
100 105 110

Lys Thr Ile Gly Cys Ser Pro Cys Thr Thr Ala Phe Arg Leu Cys Leu
115 120 125

Lys Glu Tyr Gln Thr Thr Glu Gln Gly Ala Ser Ile Ser Thr Gly Cys
130 135 140

Ser Phe Gly Asn Ala Thr Thr Lys Ile Leu Gly Gly Ser Ser Phe Val
145 150 155 160

Leu Ser Asp Pro Gly Val Gly Ala Ile Val Leu Pro Phe Thr Phe Arg
165 170 175

Trp Thr Lys Ser Phe Thr Leu Ile Leu Gln Ala Leu Asp Met Tyr Asn
180 185 190

Thr Ser Tyr Pro Asp Ala Glu Arg Leu Ile Glu Glu Thr Ser Tyr Ser
195 200 205

Gly Val Ile Leu Pro Ser Pro Glu Trp Lys Thr Leu Asp His Ile Gly
210 215 220

Arg Asn Ala Arg Ile Thr Tyr Arg Val Arg Val Gln Cys Ala Val Thr
225 230 235 240

Tyr Tyr Asn Thr Thr Cys Thr Thr Phe Cys Arg Pro Arg Asp Asp Gln
245 250 255

Phe Gly His Tyr Ala Cys Gly Ser Glu Gly Gln Lys Leu Cys Leu Asn
260 265 270

Gly Trp Gln Gly Val Asn Cys Glu Glu Ala Ile Cys Lys Ala Gly Cys
275 280 285

Asp Pro Val His Gly Lys Cys Asp Arg Pro Gly Glu Cys Glu Cys Arg
290 295 300

Pro Gly Trp Arg Gly Pro Leu Cys Asn Glu Cys Met Val Tyr Pro Gly
305 310 315 320

Cys Lys His Gly Ser Cys Asn Gly Ser Ala Trp Lys Cys Val Cys Asp
325 330 335

Thr Asn Trp Gly Gly Ile Leu Cys Asp Gln Asp Leu Asn Phe Cys Gly
340 345 350

Thr His Glu Pro Cys Lys His Gly Gly Thr Cys Glu [illegible] Thr Ala Pro
355 360 365

Asp Lys Tyr Arg Cys Thr Cys Ala Glu Gly Leu Ser Gly Glu Gln Cys
370 375 380

Glu Ile Val Glu His Pro Cys Ala Thr Arg Pro Cys Arg Asn Gly Gly
385 390 395 400

Thr Cys Thr Leu Lys Thr Ser Asn Arg Thr Gln Ala Gln Val Tyr Arg
405 410 415

Thr Ser His Gly Arg Ser Asn Met Gly Arg Pro Val Arg Arg Ser Ser
420 425 430

Ser Met Arg Ser Leu Asp His Leu Arg Pro Glu Gly Gln Ala Leu Asn
435 440 445

Gly Ser Ser Ser Ser Gly Leu Val Ser Leu Gly Ser Leu Gln Leu Gln
450 455 460

Gln Gln Leu Ala Pro Asp Phe Thr Cys Asp Cys Ala Ala Gly Trp Thr
465 470 475 480

Gly Pro Thr Cys Glu Ile Asn Ile Asp Glu Cys Ala Gly Gly Pro Cys
485 490 495

Glu His Gly Gly Thr Cys Ile Asp Leu Ile Gly Gly Phe Arg Cys Glu
500 505 510

Cys Pro Pro Glu Trp His Gly Asp Val Cys Gln Val Asp Val Asn Glu
515 520 525

Cys Glu Ala Pro His Ser Ala Gly Ile Ala Ala Asn Ala Leu Leu Thr
530 535 540

Thr Thr Ala Thr Ala Ile Ile Gly Ser Asn Leu Ser Ser Thr Ala Leu
545 550 555 560

Leu Ala Ala Leu Thr Ser Ala Val Ala Ser Thr Ser Leu Ala Ile Gly
565 570 575

Pro Cys Ile Asn Ala Lys Glu Cys Arg Asn Gln Pro Gly Ser Phe Ala
580 585 590

Cys Ile Cys Lys Glu Gly Trp Gly Gly Val Thr Cys Ala Glu Asn Leu
595 600 605

Asp Asp Cys Val Gly Gln Cys Arg Asn Gly Ala Thr Cys Ile Asp Leu
610 615 620

Val Asn Asp Tyr Arg Cys Ala Cys Ala Ser Gly Phe Thr Gly Arg Asp
625 630 635 640

Cys Glu Thr Asp Ile Asp Glu Cys Ala Thr Ser Pro Cys Arg Asn Gly
645 650 655

Gly Glu Cys Val Asp Met Val Gly Lys Phe Asn Cys Ile Cys Pro Leu
660 665 670

Gly Tyr Ser Gly Ser Leu Cys Glu Glu Ala Lys Glu Asn Cys Thr Pro
675 680 685

Ser Pro Cys Leu Glu Gly His Cys Leu Asn Thr Pro Glu Gly Tyr Tyr
690 695 700

Cys His Cys Pro Pro Asp Arg Ala Gly Lys His Cys Glu Gln Leu Arg
705 710 715 720

Pro Leu Cys Ser Gln Pro Pro Cys Asn Glu Gly Cys Phe Ala Asn Val
725 730 735

Ser Leu Ala Thr Ser Ala Thr Thr Thr Thr Thr Thr Thr Thr Thr Ala
740 745 750

Thr Thr Thr Arg Lys Met Ala Lys Pro Ser Gly Leu Pro Cys Ser Gly
755 760 765

His Gly Ser Cys Glu Met Ser Asp Val Gly Thr Phe Cys Lys Cys His
770 775 780

Val Gly His Thr Gly Thr Phe Cys Glu His Asn Leu Asn Glu Cys Ser
785 790 795 800

Pro Asn Pro Cys Arg Asn Gly Gly Ile Cys Leu Asp Gly Asp Gly Asp
805 810 815

Phe Thr Cys Glu Cys Met Ser Gly Trp Thr Gly Lys Arg Cys Ser Glu
820 825 830

Arg Ala Thr Gly Cys Tyr Ala Gly Gln Cys Gln Asn Gly Gly Thr Cys
835 840 845

Met Pro Gly Ala Pro Asp Lys Ala Leu Gln Pro His Cys Arg Cys Ala
850 855 860

Pro Gly Trp Thr Gly Leu Phe Cys Ala Glu Ala Ile Asp Gln Cys Arg
865 870 875 880

Gly Gln Pro Cys His Asn Gly Gly Thr Cys Glu Ser Gly Ala Gly Trp
885 890 895

Phe Arg Cys Val Cys Ala Gln Gly Phe Ser Gly Pro Asp Cys Arg Ile
900 905 910

Asn Val Asn Glu Cys Ser Pro Gln Pro Cys Gln Gly Gly Ala Thr Cys
915 920 925

Ile Asp Gly Ile Gly Gly Tyr Ser Cys Ile Cys Pro Pro Gly Arg His
930 935 940

Gly Leu Arg Cys Glu Ile Leu Leu Ser Asp Pro Lys Ser Ala Cys Gln
945 950 955 960

Asn Ala Ser Asn Thr Ile Ser Pro Tyr Thr Ala Leu Asn Arg Ser Gln
965 970 975

Asn Trp Leu Asp Ile Ala Leu Thr Gly Arg Thr Glu Asp Asp Glu Asn
980 985 990

Cys Asn Ala Cys Val Cys Glu Asn Gly Thr Ser Arg Cys Thr Asn Leu
995 1000 1005

Trp Cys Gly Leu Pro Asn Cys Tyr Lys Val Asp Pro Leu Ser Lys Ser
1010 1015 1020

Ser Asn Leu Ser Gly Val Cys Lys Gln His Glu Val Cys Val Pro Ala
1025 1030 1035 1040

Leu Ser Glu Thr Cys Leu Ser Ser Pro Cys Asn Val Arg Gly Asp Cys
1045 1050 1055

Arg Ala Leu Glu Pro Ser Arg Arg Val Ala Pro Pro Arg Leu Pro Ala
1060 1065 1070

Lys Ser Ser Cys Trp Pro Asn Gln Ala Val Val Asn Glu Asn Cys Ala
1075 1080 1085

Arg Leu Thr Ile Leu Leu Ala Leu Glu Arg Val Gly Lys Gly Ala Ser
1090 1095 1100

Val Glu Gly Leu Cys Ser Leu Val Arg Val Leu Leu Ala Ala Gln Leu
1105 1110 1115 1120

Ile Lys Lys Pro Ala Ser Thr Phe Gly Gln Asp Pro Gly Met Leu Met
1125 1130 1135

Val Leu Cys Asp Leu Lys Thr Gly Thr Asn Asp Thr Val Glu Leu Thr
1140 1145 1150

Val Ser Ser Ser Lys Leu Asn Asp Pro Gln Leu Pro Val Ala Val Gly
1155 1160 1165

Leu [illegible] Gly [illegible] Leu Leu Ser Ser Arg Gln Leu Asn Gly [illegible] Gln Arg
1170 1175 1180

Arg Lys Glu Leu Glu Leu Gln His Ala Lys Leu Ala Ala Leu Thr Ser
1185 1190 1195 1200

Ile Val Glu Val Lys Leu Glu Thr Ala Arg Val Ala Asp Gly Ser Gly
1205 1210 1215

His Ser Leu Leu Ile Gly Val Leu Cys Gly Val Phe Ile Val Leu Val
1220 1225 1230

Gly Phe Ser Val Phe Ile Ser Leu Tyr Trp Lys Gln Arg Leu Ala Tyr
1235 1240 1245

Arg Thr Ser Ser Gly Met [illegible] Leu Thr Pro Ser Leu Asp Ala Leu Arg
1250 1255 1260

His Glu Glu Glu Lys Ser Asn Asn Leu Gln [illegible] Glu Glu Asn Leu Arg
1265 1270 1275 1280

Arg Tyr Thr Asn Pro Leu Lys Gly Ser Thr Ser Ser Leu Arg Ala Ala
1285 1290 1295

Thr Gly Met Glu Leu Ser Leu Asn Pro Ala Pro Glu Leu Ala Ala Ser
1300 1305 1310

Ala Ala Ser Ser Ser Ala Leu His Arg Ser Gln Pro Leu Phe Pro Pro
1315 1320 1325

Cys Asp Phe Glu Arg Glu Leu Asp Ser Ser Thr Gly Leu Lys Gln Ala
1330 1335 1340

His Lys Arg Ser Ser [illegible] Ile Leu Leu His Lys Thr Gln Asn Ser Asp
1345 1350 1355 1360

Met Arg Lys Asn Thr Val Gly Ser Leu Asp Ser Pro Arg Lys Asp Phe
1365 1370 1375

Gly Lys Arg Ser Ile Asn Cys Lys Ser Met Pro Pro Ser Ser Gly Asp
1380 1385 1390

Glu Gly Ser Asp Val Leu Ala Thr Thr Val Met Val
1395 1400

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 3

(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 12

(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 18

(D) OTHER INFORMATION: /mod_base= i

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CGNYTTTCCY TNAARSANTA YCA

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 6

(D) OTHER INFORMATION: /label= A
/note= "X=histidine or glutamic acid"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Arg Leu Cys Cys Lys Xaa Tyr Gln
1 5

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 3

(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 9

(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 12

(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 15

(D) OTHER INFORMATION: /mod_base= i

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TCNATGCANG TNCCNCCRTT

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Asn Gly Gly Thr Cys Ile Asp
1 5
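
As an illustrative aside that is not part of the original listing: the oligonucleotide of SEQ ID NO: 11 appears to be the antisense counterpart of a degenerate coding sequence for the peptide of SEQ ID NO: 12. The minimal Python sketch below computes the reverse complement so that the codons can be read directly. It assumes that the positions annotated as inosine are written as N and that the standard IUPAC ambiguity codes apply; the names COMPLEMENT and reverse_complement are hypothetical and chosen only for this example.

```python
# Complement of each IUPAC nucleotide code used in the listing.
# R (A/G) pairs with Y (C/T); N stands in for the inosine positions.
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def reverse_complement(oligo: str) -> str:
    """Return the reverse complement of an oligonucleotide written 5' to 3'."""
    cleaned = oligo.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


SEQ_ID_NO_11 = "TCNATGCANG TNCCNCCRTT"
sense_strand = reverse_complement(SEQ_ID_NO_11)

# Split the sense strand into triplets for comparison with SEQ ID NO: 12.
codons = [sense_strand[i:i + 3] for i in range(0, len(sense_strand), 3)]
print(sense_strand)
print(" ".join(codons))
```

Read in triplets, the sense strand AAY GGN GGN ACN TGC ATN GA is consistent with Asn Gly Gly Thr Cys Ile Asp, with the final Asp codon truncated after two bases.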

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 2..163

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

G TCC CGC CTC ACT CCC GGG GGA CCC TGC AGC TTC GGC TCA GGG TCT 46
Ser Arg Val Thr Ala Gly Gly Pro Cys Ser Phe Gly Ser Gly Ser
1 5 10 15

ACG CCT GTC ATC GGG GGT AAC ACC TTC AAT CTC AAG GCC AGC CGT GGC 94
Thr Pro Val Ile Gly Gly Asn Thr Phe Asn Leu Lys Ala Ser Arg Gly
20 25 30

AAC CAC CCT AAT CGC ATC CTA CTG CCT TTC AGT TTC ACC TGC CCG AGG 142
Asn Asp Arg Asn Arg Ile Val Leu Pro Phe Ser Phe Thr Trp Pro Arg
35 40 45

TCC TAC ACT TTG CTC GTG GAG 163
Ser Tyr Thr Leu Leu Val Glu
50



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Ser Arg Val Thr Ala Gly Gly Pro Cys Ser Phe Gly Ser Gly Ser Thr
1 5 10 15

Pro Val Ile Gly Gly Asn Thr Phe Asn Leu Lys Ala Ser Arg Gly Asn
20 25 30

Asp Arg Asn Arg Ile Val Leu Pro Phe Ser Phe Thr Trp Pro Arg Ser
35 40 45

Tyr Thr Leu Leu Val Glu 
50 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 135 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..135

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCT TCT AAC GTC TGT GGT CCC CAT GGC AAG TGC AAG AGC CAG TCG GCA 48
Ser Ser Asn Val Cys Gly Pro His Gly Lys Cys Lys Ser Gln Ser Ala
1 5 10 15

GGC AAA TTC ACC TGT GAC TGT AAC AAA GGC TTC ACC GGC ACC TAC TGC 96
Gly Lys Phe Thr Cys Asp Cys Asn Lys Gly Phe Thr Gly Thr Tyr Cys
20 25 30

CAT GAA AAT ATC AAC GAC TGC GAG AGC AAC CCC TGT AAA 135
His Glu Asn Ile Asn Asp Cys Glu Ser Asn Pro Cys Lys
35 40 45

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Ser Ser Asn Val Cys Gly Pro His Gly Lys Cys Lys Ser Gln Ser Ala
1 5 10 15

Gly Lys Phe Thr Cys Asp Cys Asn Lys Gly Phe Thr Gly Thr Tyr Cys
20 25 30

His Glu Asn Ile Asn Asp Cys Glu Ser Asn Pro Cys Lys
35 40 45
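
As an illustrative aside that is not part of the original listing: the relationship between the cDNA of SEQ ID NO: 15 (CDS at 1..135) and the protein of SEQ ID NO: 16 can be checked by translating the nucleotide sequence in frame. The sketch below is a minimal Python example that assumes the standard genetic code; the names CODON_TABLE, THREE_LETTER and translate are hypothetical and chosen only for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(cdna: str, frame: int = 0) -> list[str]:
    """Translate a cDNA string into three-letter residues from the given frame."""
    seq = cdna.replace(" ", "").upper()
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]]
            for i in range(frame, len(seq) - 2, 3)]


SEQ_ID_NO_15 = (
    "TCT TCT AAC GTC TGT GGT CCC CAT GGC AAG TGC AAG AGC CAG TCG GCA "
    "GGC AAA TTC ACC TGT GAC TGT AAC AAA GGC TTC ACC GGC ACC TAC TGC "
    "CAT GAA AAT ATC AAC GAC TGC GAG AGC AAC CCC TGT AAA"
)

protein = translate(SEQ_ID_NO_15)
print(len(protein), "residues")
print(" ".join(protein))
```

Under these assumptions the translation yields 45 residues, beginning Ser Ser Asn Val and ending Pro Cys Lys, in agreement with SEQ ID NO: 16.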

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 3

(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 6

(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 12

(D) OTHER INFORMATION: /mod_base= i

(ix) FEATURE:

(A) NAME/KEY: modified base

(B) LOCATION: 18

(D) OTHER INFORMATION: /mod_base= i

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

CGNYTNTGCY TNAARSANTA YCA 23
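
As an illustrative aside that is not part of the original listing: the degenerate oligonucleotide of SEQ ID NO: 17 appears to be designed against the peptide Arg Leu Cys Leu Lys Xaa Tyr Gln of SEQ ID NO: 18, where Xaa is glutamic acid or histidine. The minimal Python sketch below expands each degenerate codon into the residues it can encode and checks that the peptide is covered. It assumes the standard genetic code and the standard IUPAC ambiguity codes, and it treats the inosine positions (written N) as matching any base; the names residues_for and degeneracy are hypothetical.

```python
from itertools import product
from math import prod

# IUPAC ambiguity codes that occur in the oligonucleotide.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "S": "CG", "N": "ACGT"}

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def residues_for(codon: str) -> set[str]:
    """One-letter residues that a (possibly degenerate) codon can encode."""
    return {CODON_TABLE["".join(c)] for c in product(*(IUPAC[b] for b in codon))}


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences if every ambiguity code is expanded."""
    return prod(len(IUPAC[b]) for b in oligo)


PRIMER = "CGNYTNTGCYTNAARSANTAYCA"
# Residues 1-7 of SEQ ID NO: 18; position 6 (Xaa) is Glu or His.
PEPTIDE = ["R", "L", "C", "L", "K", "EH", "Y"]

codons = [PRIMER[i:i + 3] for i in range(0, 21, 3)]
covered = all(set(wanted) <= residues_for(codon)
              for codon, wanted in zip(codons, PEPTIDE))

for codon in codons:
    print(codon, sorted(residues_for(codon)))
print("peptide covered:", covered)
```

The last two bases of the oligonucleotide, CA, match the first two bases of either glutamine codon (CAA or CAG), which is consistent with the final Gln of the peptide. Note that a degenerate codon can admit residues beyond the intended one; for example, YTN covers both Leu and Phe.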

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 6

(D) OTHER INFORMATION: /label= A
/note= "X=glutamic acid or histidine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

Arg Leu Cys Leu Lys Xaa Tyr Gln
1 5

WHAT IS CLAIMED IS:

1. A purified vertebrate Serrate protein.

2. The protein of claim 1 which is a human
protein.

3. The protein of claim 1 which is a mammalian
protein.

4. The protein of claim 2 which comprises the
amino acid sequence substantially as set forth in amino acid
numbers 30 - 1218 of SEQ ID NO: 2.

5. The protein of claim 2 which comprises the
amino acid sequence substantially as set forth in amino acid
numbers 1 - 1257 of SEQ ID NO: 4.

6. A purified human protein encoded by a nucleic
acid hybridizable to plasmid SerFL or the Serrate sequence
therein as deposited with the ATCC and assigned accession
number 68876.

7. The protein of claim 2 which is encoded by
plasmid pBS39 as deposited with the ATCC and assigned
accession number 97068.

8. The protein of claim 2 which comprises the
Serrate amino acid sequence encoded by plasmid pBS15 as
deposited with the ATCC and assigned accession number



9. The protein of claim 2 which comprises the
Serrate amino acid sequence encoded by plasmid pBS3-2 as
deposited with the ATCC and assigned accession number

10. A purified fragment of the protein of claim 1,
which is able to display one or more functional activities of
a Serrate protein.

11. A purified fragment of the protein of claim 2,
which is able to display one or more functional activities of
a human or D. melanogaster Serrate protein.

12. A purified fragment of the protein of claim 2
or 7, which is able to be bound by an antibody directed
against a human Serrate protein.

13. A molecule comprising the fragment of claim
10.

14. A purified fragment of a vertebrate Serrate
protein comprising a domain of the protein selected from the
group consisting of the extracellular domain, DSL domain,
epidermal growth factor-like repeat domain, cysteine-rich
domain, transmembrane domain, and intracellular domain.

15. A purified fragment of a vertebrate Serrate
protein comprising the DSL domain of the protein.

16. A purified fragment of a vertebrate Serrate
protein comprising an epidermal growth factor-homologous
repeat of the protein.

17. The fragment of claim 14 in which the Serrate
protein is a human Serrate protein.

18. A purified fragment of a vertebrate Serrate
protein comprising a region homologous to a Notch protein or
a Delta protein, and consisting of at least ten amino acids.

19. A chimeric protein comprising a fragment of a
vertebrate Serrate protein consisting of at least ten amino
acids fused via a covalent bond to an amino acid sequence of
a second protein, in which the second protein is not a
Serrate protein.

20. The chimeric protein of claim 19 in which the
fragment of a Serrate protein is a fragment capable of being
bound by an anti-Serrate antibody.

21. The chimeric protein of claim 19 in which the
Serrate protein is a human protein.

22. The chimeric protein of claim 19 which is able
to display one or more functional activities of a Serrate
protein.

23. A purified fragment of a vertebrate Serrate
protein which fragment (a) is capable of being bound by an
anti-Serrate antibody; (b) lacks the transmembrane and
intracellular domains of the protein; and (c) consists of at
least ten amino acids of the Serrate protein.

24. A purified fragment of a vertebrate Serrate
protein which fragment (a) is capable of being bound by an
anti-Serrate antibody; (b) lacks the extracellular domain of
the protein; and (c) consists of at least ten amino acids of
the Serrate protein.

25. A purified fragment of a vertebrate Serrate
protein which is able to bind to a Notch protein.

26. The fragment of claim 25, which lacks the
epidermal growth factor-like repeats of the Serrate protein.

27. The fragment of claim 23, 24, 25 or 26 in
which the Serrate protein is a human Serrate protein.

28. The fragment of claim 29, which is a fragment
of SEQ ID NO: 2 or SEQ ID NO: 4.

29. A molecule comprising the fragment of claim
25.

30. An antibody which is capable of binding the
Serrate protein of claim 1 and which does not bind a
Drosophila Serrate protein.

31. An antibody which is capable of binding the
Serrate protein of claim 2 and which does not bind a
Drosophila Serrate protein.

32. The antibody of claim 30 which is monoclonal.

33. A molecule comprising a fragment of the
antibody of claim 32, which fragment is capable of binding a
vertebrate Serrate protein.

34. An isolated nucleic acid comprising a
nucleotide sequence encoding a vertebrate Serrate protein.

35. The nucleic acid of claim 34 which is DNA.

36. An isolated nucleic acid comprising a
nucleotide sequence absolutely complementary to the
nucleotide sequence of claim 34.

37. An isolated nucleic acid comprising a
nucleotide sequence encoding the Serrate protein of claim 2.

38. An isolated nucleic acid comprising the
Serrate coding sequence contained in plasmid pBS39 as
deposited with the ATCC and assigned accession number 97068.

39. An isolated human nucleic acid hybridizable to
plasmid SerFL or the Serrate sequence therein as deposited
with the ATCC and assigned accession number 68876.

40. An isolated nucleic acid comprising the
Serrate coding sequence contained in plasmid pBS3-2 as
deposited with the ATCC and assigned accession number

41. An isolated nucleic acid comprising the
Serrate coding sequence contained in plasmid pBS15 as
deposited with the ATCC and assigned accession number

42. An isolated nucleic acid comprising a
nucleotide sequence encoding a protein, said protein
comprising amino acid numbers 1 - 1257 of SEQ ID NO: 4.

43. An isolated nucleic acid comprising a fragment
of a vertebrate Serrate gene consisting of at least 8
nucleotides.

44. An isolated nucleic acid comprising a
nucleotide sequence encoding the fragment of claim 14, 15, 16
or 25.

45. The nucleic acid of claim 44 in which the
fragment is a fragment of a human Serrate protein.

46. An isolated nucleic acid comprising a
nucleotide sequence encoding the fragment of claim 12.

47. An isolated nucleic acid comprising a
nucleotide sequence encoding a protein, said protein
comprising amino acid numbers 30 - 1218 of SEQ ID NO: 2.

48. An isolated nucleic acid comprising a
nucleotide sequence encoding the protein of claim 21.

49. A recombinant cell containing the nucleic acid
of claim 34, 37 or 43.

50. A recombinant cell containing the nucleic acid
of claim 38, 40 or 41.

51. A method of producing a Serrate protein
comprising growing a recombinant cell containing the nucleic
acid of claim 34 or 37 such that the encoded Serrate protein
is expressed by the cell, and recovering the expressed
Serrate protein.

52. A method of producing a Serrate protein
comprising growing a recombinant cell containing the nucleic
acid of claim 38, 40 or 41 such that the encoded Serrate
protein is expressed by the cell, and recovering the
expressed Serrate protein.

53. A method of producing a Serrate protein
comprising growing a recombinant cell containing the nucleic
acid of claim 45 such that the encoded protein is expressed
by the cell, and recovering the expressed protein.

54. A method of producing a protein comprising a
fragment of a Serrate protein, which method comprises growing
a recombinant cell containing the nucleic acid of claim 46
such that the encoded protein is expressed by the cell, and
recovering the expressed protein.

55. The product of the process of claim 51.

56. The product of the process of claim 52.

57. The product of the process of claim 53.



58. The product of the process of claim 54.

59. A pharmaceutical composition comprising a
therapeutically effective amount of a vertebrate Serrate
protein; and a pharmaceutically acceptable carrier.

60. The composition of claim 59 in which the
Serrate protein is a human Serrate protein.

61. A pharmaceutical composition comprising a
therapeutically effective amount of the fragment of claim 14,
15, 16 or 25; and a pharmaceutically acceptable carrier.

62. A pharmaceutical composition comprising a
therapeutically effective amount of the fragment of claim 12;
and a pharmaceutically acceptable carrier.

63. A pharmaceutical composition comprising a
therapeutically effective amount of a molecule comprising a
fragment of a vertebrate Serrate protein, which derivative or
analog is characterized by the ability to bind to a Notch
protein or to a molecule comprising the epidermal growth
factor-like repeats 11 and 12 of a Notch protein.

64. A pharmaceutical composition comprising a
therapeutically effective amount of the nucleic acid of claim
34, 36 or 37; and a pharmaceutically acceptable carrier.

65. A pharmaceutical composition comprising a
therapeutically effective amount of the nucleic acid of claim
44; and a pharmaceutically acceptable carrier.

66. A pharmaceutical composition comprising a
therapeutically effective amount of the nucleic acid of claim
46; and a pharmaceutically acceptable carrier.

67. A pharmaceutical composition comprising a
therapeutically effective amount of the antibody of claim 30;
and a pharmaceutically acceptable carrier. 

68. A pharmaceutical composition comprising a
therapeutically effective amount of a fragment or derivative
of the antibody of claim 30 containing the binding domain of
the antibody; and a pharmaceutically acceptable carrier.

69. A method of treating or preventing a disease 
or disorder in a subject comprising administering to a 
subject in which such treatment or prevention is desired a 
therapeutically effective amount of a vertebrate Serrate
protein or derivative thereof which is able to bind to a
Notch protein. 



70. The method according to claim 69 in which the 
disease or disorder is a malignancy characterized by 
increased Notch activity or increased expression of a Notch
protein or of a Notch derivative capable of being bound by an 
anti-Notch antibody, relative to said Notch activity or 
expression in an analogous non-malignant sample. 

71. The method according to claim 69 in which the
disease or disorder is selected from the group consisting of
cervical cancer, breast cancer, colon cancer, melanoma, 
seminoma, and lung cancer. 

72. The method according to claim 69 in which the
subject is a human.

73. The method according to claim 69 in which the 
Serrate protein is a human Serrate protein. 

74. A method of treating or preventing a disease 
or disorder in a subject comprising administering to a 
subject in which such treatment or prevention is desired a 
therapeutically effective amount of a molecule, in which the
molecule is an oligonucleotide which (a) comprises ten
nucleotides; (b) comprises a sequence absolutely
complementary to an at least ten nucleotide portion of an RNA
transcript specific to a vertebrate Serrate gene; and (c) is
hybridizable to the RNA transcript. 

75. A method of treating or preventing a disease 
or disorder in a subject comprising administering to a
subject in which such treatment or prevention is desired an
effective amount of the nucleic acid of claim 34, 37 or 46. 

76. A method of treating or preventing a disease 
or disorder in a subject comprising administering to a
subject in which such treatment or prevention is desired an
effective amount of the antibody of claim 32. 

77. The method according to claim 73 in which the 
disease or disorder is a disease or disorder of the central
nervous system.

78. An isolated oligonucleotide comprising ten 
nucleotides, and comprising a sequence absolutely
complementary to an at least ten nucleotide portion of an RNA
transcript specific to a vertebrate Serrate gene, which 
oligonucleotide is hybridizable to the RNA transcript. 

79. A pharmaceutical composition comprising the 
oligonucleotide of claim 78; and a pharmaceutically
acceptable carrier.

80. A method of inhibiting the expression of a 
nucleic acid sequence encoding a Serrate protein in a cell
comprising providing the cell with an effective amount of the
oligonucleotide of claim 78. 

81. A method of diagnosing a disease or disorder 
characterized by an aberrant level of Notch-Serrate protein
binding activity in a patient, comprising measuring the
ability of a Notch protein in a sample derived from the
patient to bind to a vertebrate Serrate protein, in which an
increase or decrease in the ability of the Notch protein to
bind to the Serrate protein, relative to the ability found in
an analogous sample from a normal individual, indicates the 
presence of the disease or disorder in the patient. 

82. A method of diagnosing a disease or disorder 
characterized by an aberrant level of Serrate protein in a 
patient, comprising measuring the levels of a vertebrate 
Serrate protein in a sample derived from the patient, in 
which an increase or decrease in the levels of the Serrate
protein, relative to the levels of the Serrate protein found 
in an analogous sample from a normal individual, indicates 
the presence of the disease or disorder in the patient. 
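
By way of non-limiting illustration only, the sequence criterion recited for the oligonucleotide of claims 74 and 78 (a sequence absolutely complementary to an at least ten nucleotide portion of an RNA transcript) may be checked computationally. The following Python sketch is an assumption of this edition and forms no part of the claims; the function names are hypothetical, and the test transcript is a 20-nucleotide stretch copied from the figure sequence around an ATG codon. The sketch tests sequence complementarity only and does not model whether the oligonucleotide is hybridizable under any particular conditions.

```python
# Watson-Crick pairing for RNA; DNA input is accepted by treating T as U.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def normalize(seq):
    """Strip whitespace, upper-case, and write the sequence in the RNA alphabet."""
    return "".join(seq.split()).upper().replace("T", "U")


def reverse_complement(seq):
    """Return the strand that would pair perfectly with seq, read 5' to 3'."""
    # Unknown symbols map to N so that they can never produce a match.
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(normalize(seq)))


def has_complementary_stretch(oligo, transcript, min_len=10):
    """True if some min_len-nucleotide window of the oligonucleotide is
    perfectly complementary to a portion of the transcript."""
    target = normalize(transcript)
    paired = reverse_complement(oligo)
    return any(
        paired[i:i + min_len] in target
        for i in range(len(paired) - min_len + 1)
    )


if __name__ == "__main__":
    transcript = "ATGCGTTCCCCACGGACACG"  # illustrative stretch from the figure
    print(has_complementary_stretch("GGGAACGCAT", transcript))  # perfect 10-mer
    print(has_complementary_stretch("GGGAACGCAA", transcript))  # one mismatch
```

A single mismatch anywhere in the ten-nucleotide window causes the check to fail, which reflects the word "absolutely" in the claim language; an oligonucleotide shorter than ten nucleotides can never satisfy it.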

1/20

10 20 30 

GAATTCCCCT CCCCCCTTTT TCCATGCAGC 
70 80 90 

AATCATAATA ATAAAAGAAG GGGAGCGCGA 
130 140 150 

GGAGGGGGAG CGTCTCAAAG AAGCGATCAG 
190 200 210 

TCTGGAAGGG GCCGCTCTTG AAAGGGCTTT 
250 260 270 

TGCTCCAATC GGCGGAGTAT ATTAGAGCCG 
310 320 330 

AGCACCGGCG GCAGCACCAG CGCGAACAGC 
370 380 390 

GCGCGCAGCG ATGCGTTCCC CACGGACACG 
MRS P R T R 
430 440 450 

GCTCGCCCTG CTCTGTGCCC TGCGAGCCAA 
L A L L C A L R A K 
490 500 510 

GGAGATCCTG TCCATGCAGA ACGTGAACGG 
EIL SMQ NVNG
550 560 570 

CGCCCGGAAC CCGGGAGACC GCAAGTGCAC 
ARN PGD RKCT 
610 620 630 

GTGCCTCAAG GAGTATCAGT CCCGCGTCAC 
CLK EYQ SRVT 
670 680 690 

GTCCACGCCT GTCATCGGGG GCAACACCTT 
STP VIG GNTF 
730 740 750 

GAACCGCATC GTGCTGCCTT TCAGTTTCGC 
N R I V L P F S F A 
790 800 810 

GGCGTGGGAT TCCAGTAATG ACACCGTTCA 
AWD SSN DTVQ 
850 860 870 

CTCGGGCATG ATCAACCCCA GCCGGCAGTG 
SGM INP SRQW 

FIG. 
2/20

910 920 930 940 950 960 

CCACTTTGAG TATCAGATCC GCGTGACCTG TGATGACTAC TACTATGGCT TTGGCTGTAA 
HFE Y Q I RVTC DDY YYG F G C N>
970 980 990 1000 1010 1020 

TAAGTTCTGC CGCCCCAGAG ATGACTTCTT TGGACACTAT GCCTGTGACC AGAATGGCAA 
K F C RPR DDFF GHY ACD QNGN> 
1030 1040 1050 1060 1070 1080 

CAAAACTTGC ATGGAAGGCT GGATGGGCCC CGAATGTAAC AGAGCTATTT GCCGACAAGG 
KTC MEG WMGP ECN R A I CRQG> 
1090 1100 1110 1120 1130 1140 

CTGCAGTCCT AAGCATGGGT CTTGCAAACT CCCAGGTGAC TGCAGGTGCC AGTACGGCTG 
CSP KHG SCKL PGD CRC QYGW>
1150 1160 1170 1180 1190 1200 

GCAAGGCCTG TACTGTGATA AGTGCATCCC ACACCCGGGA TGCGTCCACG GCATCTGTAA 
QGL YCD KCIP HPG CVH G I C N>
1210 1220 1230 1240 1250 1260 

TGAGCCCTGG CAGTGCCTCT GTGAGACCAA CTGGGGCGGC CAGCTCTGTG ACAAAGATCT 
EPW QCL CETN WGG QLC D KDL> 
1270 1280 1290 1300 1310 1320 

CAATTACTGT GGGACTCATC AGCCGTGTCT CAACGGGGGA ACTTGTAGCA ACACAGGCCC 
NYC GTH QPCL NGG TCS N T G P> 
1330 1340 1350 1360 1370 1380

TGACAAATAT CAGTGTTCCT GCCCTGAGGG GTATTCAGGA CCCAACTGTG AAATTGCTGA 
D K Y QCS CPEG YSG PNC E I A E>
1390 1400 1410 1420 1430 1440 

GCACGCCTGC CTCTCTGATC CCTGTCACAA CAGAGGCAGC TGTAAGGAGA CCTCCCTGGG 
HAC LSD PCHN RGS C K E T S L G> 
1450 1460 1470 1480 1490 1500 

CTTTGAGTGT GAGTGTTCCC CAGGCTGGAC CGGCCCCACA TGCTCTACAA ACATTGATGA 
FEC ECS PGWT GPT CST N I D D> 
1510 1520 1530 1540 1550 1560 

CTGTTCTCCT AATAACTGTT CCCACGGGGG CACCTGCCAG GACCTGGTTA ACGGATTTAA 
CSP NNC SHGG TCQ DLV NGFK> 
1570 1580 1590 1600 1610 1620 

GTGTGTGTGC CCCCCACAGT GGACTGGGAA AACGTGCCAG TTAGATGCAA ATGAATGTGA 
CVC PPQ WTGK TCQ LDA NECE>
1630 1640 1650 1660 1670 1680 

GGCCAAACCT TGTGTAAACG CCAAATCCTG TAAGAATCTC ATTGCCAGCT ACTACTGCGA 
A K P CVN AKSC KNL IAS YYCD> 
1690 1700 1710 1720 1730 1740

CTGTCTTCCC GGCTGGATGG GTCAGAATTG TGACATAAAT ATTAATGACT GCCTTGGCCA 

CLP GWM GQNC DIN IND CLGQ> 

1750 1760 1770 1780 1790 1800 

GTGTCAGAAT GACGCCTCCT GTCGGGATTT GGTTAATGGT TATCGCTGTA TCTGTCCACC 

CQN DAS CRDL VNG YRC I C P P> 

1810 1820 1830 1840 1850 1860 

TGGCTATGCA GGCGATCACT GTGAGAGAGA CATCGATGAA TGTGCCAGCA ACCCCTGTTT 

GYA GDH CERD IDE CAS NPCL> 

1870 1880 1890 1900 1910 1920 

GAATGGGGGT CACTGTCAGA ATGAAATCAA CAGATTCCAG TGTCTGTGTC CCACTGGTTT 

NGG HCQ NEIN RFQ CLC PTGF>

1930 1940 1950 1960 1970 1980 

CTCTGGAAAC CTCTGTCAGC TGGACATCGA TTATTGTGAG CCTAATCCCT GCCAGAACGG 

SGN LCQ LDID YCE PNP CQNG>

1990 2000 2010 2020 2030 2040 

TGCCCAGTGC TACAACCGTG CCAGTGACTA TTTCTGCAAG TGCCCCGAGG ACTATGAGGG 

AQC YNR ASDY FCK CPE DYEG> 

2050 2060 2070 2080 2090 2100 

CAAGAACTGC TCACACCTGA AAGACCACTG CCGCACGACC CCCTGTGAAG TGATTGACAG 

KNC SHL KDHC RTT PCE V I D S> 

2110 2120 2130 2140 2150 2160 

CTGCACAGTG GCCATGGCTT CCAACGACAC ACCTGAAGGG GTGCGGTATA TTTCCTCCAA 

CTV A M A SNDT PEG VRY I S S N>

2170 2180 2190 2200 2210 2220 

CGTCTGTGGT CCTCACGGGA AGTGCAAGAG TCAGTCGGGA GGCAAATTCA CCTGTGACTG 

VCG PHG KCKS QSG GKF TCDC> 

2230 2240 2250 2260 2270 2280 

TAACAAAGGC TTCACGGGAA CATACTGCCA TGAAAATATT AATGACTGTG AGAGCAACCC 

NKG FTG TYCH ENI NDC ESNP> 

2290 2300 2310 2320 2330 2340 

TTGTAGAAAC GGTGGCACTT GCATCGATGG TGTCAACTCC TACAAGTGCA TCTGTAGTGA

CRN GGT CIDG VNS YKC I C S D>

2350 2360 2370 2380 2390 2400 

CGGCTGGGAG GGGGCCTACT GTGAAACCAA TATTAATGAC TGCAGCCAGA ACCCCTGCCA 

GWE GAY CETN IND CSQ NPCH> 

2410 2420 2430 2440 2450 2460 

CAATGGGGGC ACGTGTCGCG ACCTGGTCAA TGACTTCTAC TGTGACTGTA AAAATGGGTG 

NGG TCR DLVN DFY CDC KNGW> 
2470 2480 2490 2500 2510 2520

GAAAGGAAAG ACCTGCCACT CACGTGACAG TCAGTGTGAT GAGGCCACGT GCAACAACGG 
KGK TCH SRDS QCD EAT C N N G> 
2530 2540 2550 2560 2570 2580 

TGGCACCTGC TATGATGAGG GGGATGCTTT TAAGTGCATG TGTCCTGGCG GCTGGGAAGG 
GTC YDE GDAF KCM CPG GWEG> 
2590 2600 2610 2620 2630 2640 

AACAACCTGT AACATAGCCC GAAACAGTAG CTGCCTGCCC AACCCCTGCC ATAATGGGGG 
TTC NIA RNSS CLP NP C H N G G> 
2650 2660 2670 2680 2690 2700 

CACATGTGTG GTCAACGGCG AGTCCTTTAC GTGCGTCTGC AAGGAAGGCT GGGAGGGGCC 
T C V V N G E S F T C V C KEG W E G P> 
2710 2720 2730 2740 2750 2760 

CATCTGTGCT CAGAATACCA ATGACTGCAG CCCTCATCCC TGTTACAACA GCGGCACCTG 
ICA QNT NDCS PHP CYN SGTC>
2770 2780 2790 2800 2810 2820 

TGTGGATGGA GACAACTGGT ACCGGTGCGA ATGTGCCCCG GGTTTTGCTG GGCCCGACTG 
VDG DNW YRCE CAP GFA GPDC>
2830 2840 2850 2860 2870 2880 

CAGAATAAAC ATCAATGAAT GCCAGTCTTC ACCTTGTGCC TTTGGAGCGA CCTGTGTGGA 
R I N INE CQSS PCA FGA T C V D>
2890 2900 2910 2920 2930 2940 

TGAGATCAAT GGCTACCGGT GTGTCTGCCC TCCAGGGCAC AGTGGTGCCA AGTGCCAGGA 
E I N GYR CVCP PGH SGA KCQE> 
2950 2960 2970 2980 2990 3000 

AGTTTCAGGG AGACCTTGCA TCACCATGGG GAGTGTGATA CCAGATGGGG CCAAATGGGA 
VSG RPC ITMG SVI PDG A K W D> 
3010 3020 3030 3040 3050 3060 

TGATGACTGT AATACCTGCC AGTGCCTGAA TGGACGGATC GCCTGCTCAA AGGTCTGGTG 
DDC NTC QCLN GRI ACS KVWC>
3070 3080 3090 3100 3110 3120 

TGGCCCTCGA CCTTGCCTGC TCCACAAAGG GCACAGCGAG TGCCCCAGCG GGCAGAGCTG 
GPR PCL LHKG HSE CPS GQSC> 
3130 3140 3150 3160 3170 3180 

CATCCCCATC CTGGACGACC AGTGCTTCGT CCACCCCTGC ACTGGTGTGG GCGAGTGTCG 
IPI LDD QCFV HPC TGV GECR>
3190 3200 3210 3220 3230 3240 

GTCTTCCAGT CTCCAGCCGG TGAAGACAAA GTGCACCTCT GACTCCTATT ACCAGGATAA 
SSS LQP VKTK CTS DSY YQDN>
3250 3260 3270 3280 3290 3300 

CTGTGCGAAC ATCACATTTA CCTTTAACAA GGAGATGATG TCACCAGGTC TTACTACGGA 
CAN ITF TFNK EMM SPG LTTE>
3310 3320 3330 3340 3350 3360

GCACATTTGC AGTGAATTGA GGAATTTGAA TATTTTGAAG AATGTTTCCG CTGAATATTC 
H I C SEL RNLN ILK N V S AEYS> 
3370 3380 3390 3400 3410 3420 

AATCTACATC GCTTGCGAGC CTTCCCCTTC AGCGAACAAT GAAATACATG TGGCCATTTC 
I Y I ACE PSPS ANN EIH VAIS>
3430 3440 3450 3460 3470 3480 

TGCTGAAGAT ATACGGGATG ATGGGAACCC GATCAAGGAA ATCACTGACA AAATAATCGA 
AED I R D DGNP IKE ITD KIID>
3490 3500 3510 3520 3530 3540 

TCTTGTTACT AAACGTGATG GAAACAGCTC GCTGATTGCT GCCGTTGAAG AAGTAAGAGT 
I V T K R D G N S S L I A AVE E V R V> 
3550 3560 3570 3580 3590 3600 

TCAGAGGCGG CCTCTGAAGA ACAGAACAGA TTTCCTTGTT CCCTTGCTGA GCTCTGTCTT 
Q R R P L K N R T D F L V PL L S S V L> 
3610 3620 3630 3640 3650 3660 

AACTGTGGCT TGGATCTGTT GCTTGGTGAC GGCCTTCTAC TGGTGCCTGC GGAAGCGGCG 
TVA W I C CLVT AFY WCL RKRR> 
3670 3680 3690 3700 3710 3720

GAAGCCGGGC AGCCACACAC ACTCAGCCTC TGAGGACAAC ACCACCAACA ACGTGCGGGA 
KPG SHT HSAS EDN TTN NVRE> 
3730 3740 3750 3760 3770 3780 

GCAGCTGAAC CAGATCAAAA ACCCCATTGA GAAACATGGG GCCAACACGG TCCCCATCAA 
Q L N Q I K N P I E K H G ANT VP I K>
3790 3800 3810 3820 3830 3840 

GGATTACGAG AACAAGAACT CCAAAATGTC TAAAATAAGG ACACACAATT CTGAAGTAGA 
DYE NKN SKMS KIR THN SEVE>
3850 3860 3870 3880 3890 3900 

AGAGGACGAC ATGGACAAAC ACCAGCAGAA AGCCCGGTTT GCCAAGCAGC CGGCGTACAC 
EDD MDK HQQK ARF AKQ PAYT>
3910 3920 3930 3940 3950 3960

GCTGGTAGAC AGAGAAGAGA AGCCCCCCAA CGGCACGCCG ACAAAACACC CAAACTGGAC 
LVD REE KPPN GTP TKH PNWT>
3970 3980 3990 4000 4010 4020 

AAACAAACAG GACAACAGAG ACTTGGAAAG TGCCCAGAGC TTAAACCGAA TGGAGTACAT 
NKQ DNR DLES AQS L N R MEYI>
4030 4040 4050 4060 4070 4080 

CGTATAGCAG ACCGCGGGCA CTGCCGCCGC TAGGTAGAGT CTGAGGGCTT GTAGTTCTTT 
V > 
4090 4100 4110 4120 4130 4140

AAACTGTCGT GTCATACTCG AGTCTGAGGC CGTTGCTGAC TTAGAATCCC TGTGTTAATT 
4150 4160 4170 4180 4190 4200 

TAGTTTGACA AGCTGGCTTA CACTGGCAAT GGTAGTTCTG TGGTTGGCTG GGAAATCGAG 
4210 4220 4230 4240 4250 4260 

TGGCGCATCT CACAGCTATG CAAAAAGCTA GTCAACAGTA CCCCTGGTTG TGTGTCCCCT 
4270 4280 4290 4300 4310 4320 

TGCAGCCGAC ACGGTCTCGG ATCAGGCTCC CAGGAGCTGC CCAGCCCCCT GGTACTTTGA 
4330 4340 4350 4360 4370 4380 

GCTCCCACTT CTGCCAGATG TCTAATGGTG ATGCAGTCTT AGATCATAGT TTTATTTATA 
4390 4400 4410 4420 4430 4440 

TTTATTGACT CTTGAGTTGT TTTTGTATAT TGGTTTTATG ATGACGTACA AGTAGTTCTG 
4450 4460 4470 4480 4490 4500 

TATTTGAAAG TGCCTTTGCA GCTCAGAACC ACAGCAACGA TCACAAATGA CTTTATTATT 
4510 4520 4530 4540 4550 4560 

TATTTTTTT7 AATTGTATTT TTGTTGTTGG GGGAGGGGAG ACTTTGATGT CAGCAGTTGC 
4570 4580 4590 4600 4610 4620 

TGGTAAAATG AAGAATTTAA AGAAAAAATG TCCAAAAGTA GAACTTTGTA TAGTTATGTA 
4630 4640 4650 4660 4670 4680 

AATAATTCTT TTTTATTAAT CACTGTGTAT ATTTGATTTA TTAACTTAAT AATCAAGAGC 
4690 4700 4710 4720 4730 4740 

CTTAAAACAT CATTCCTTTT TATTTATATG TATGTGTTTA GAATTGAAGG TTTTTGATAG 
4750 4760 4770 4780 4790 4800 

CATTGTAAGC GTATGGCTTT ATTTTTTTGA ACTCTTCTCA TTACTTGTTG CCTATAAGCC 
4810 4820 4830 4840 4850 4860 

AAAAAGGAAA GGGTGTTTTG AAAATAGTTT ATTTTAAAAC AATAGGATGG GCTACACGTA 
4870 4880 4890 4900 4910 4920 

CATAGGTAAA TAATAGCACC GTACTGGTTA TGATGATGAA AATAACTGGA AACTTGAAAG 
4930 4940 4950 4960 4970 4980 

CTTGTGGTAA TGGCAGATAA AGATGGTTCA CCTGGGAAAT TAAAACTTGA ATGGTTGTAC 
4990 5000 5010 5020 5030 5040 

AGAAAAGCAC AGAGTGGMT GCACATCAAT GACAGTAAGG GAGTTAGTTC TAGGAACAGC 
5050 5060 5070 5080 5090 5100 

TCCTGAACAG TAAGATTCCC GCAATAGTCT CCGCCTCGTT CGTCTATGGT ATGCATCCCA 
5110 5120 5130 5140 5150 5160 

TTCATTTTCT TCTTCTGATT ATTGTCATCT TTCCCTTTGC CAAATGGGCA GTTATTGTTT 
5170 5180 5190 5200 5210 5220 

CAGGGAGAGA AGCTGCTCAT TGGCCAATCA TTCTGGTGTG CAGTGCTCCA TCGGATTCTA 
5230 5240 5250 5260 5270 5280 

CATGTCCAAC AAGGCATGTC TGGATGATGC AATGTCTGTC TGACCCCCGG AATTCCGTGC 
5290 5300 5310 5320 5330 5340

AGAGACAACA TTCTAGACAG ATATACACTT TTTATTATTA ACAAACTTTG GCCACAACC1 
5350 5360 5370 5380 5390 5400 

TTGATGTATA AATTGCCGGA TTTCCCCAGT CCTTTCATTG TGGCTTTGGA CAGGAGCAGG 
5410 5420 5430 5440 5450 5460 

CTCACTTGTC TGCTTCAGGC TGCCTTTCTC TTGGGTTGCA CCTCAGTTCT TACTTATTTA 
5470 5480 5490 5500 5510 5520 

TTTATTTTGA GTGGAGCATA GGGGCCTCTT CCAAAATGGG TAGAGCTCAG GGGCTTTCTT 
5530 5540 5550 5560 5570 5580 

ATTGAAATGG TCACATGATA AAAACGGGCT GAAAAAGGAG AGTTCCAGGA GAAAAGCCCA 
5590 5600 5610 5620 5630 5640 

GAAAAGGCCC CTCCTCAGAA GACAGCCTTT AAGCCTC I TG CTTACTGAAG GAAGCCCCAC 
5650 5660 5670 5680 5690 5700 

CTTCTAGCAC TGAGGCCGGG TCTGATCTTC CAGAGGAGTT GGAGGAGTCC ATGAGAATGG 
5710 5720 5730 5740 5750 5760 

CCACCATTCT TGCTTGCTGC TGCTGATGTT GCAGTTTTGA GAGAACAGCG GGATCCTTGT 
5770 5780 5790 5800 5810 5820 

TGTCCTCTAG AGACTTGAGT CTGTCACTGA CATTTTTTCA GTTCCTTTGC TCATAGACCA 
5830 5840 5850 5860 5870 5880 

TACGAGGAAT TAGTGATGTG TCAGTTGAGA GTTCACAATC TCATTGTTCA TTTAATTCAC 
5890 5900 5910 5920 5930 5940 

TTTAAAGTTG TCAATTTCTG TGTGAGTAAC CTGTAAAAGA CACCTTTCCA GAAGAGTTTT 
5950 5960 5970 5980 5990 6000 

GCCGTCTGTT TGAAAAAAAA ATCTTTATAA ACTTTCCTAA GTATCTGGAT TTGGATTCCT 
6010 6020 6030 6040 6050 6060 

TATTTGGAGA GAAAATGTAC CCTGTCTCCA CCAAAAATAC AAAAATTAGC CAGGCTTGG1 
6070 6080 6090 6100 6110 6120 

GGTGCACACC GGTAATCCCA GCAACTCTGG AGACTAAGGC AGGAAGAATC GCTTGACCCA 
6130 6140 6150 6160 6170 6180 

GGAGGGTCGA GGCTACAATG AGTTGAAACC GCGCCACTGC ACTCCAGCCT GGGCGACAGT 
6190 6200 6210 6220 6230 6240 

GCGAGGCCCT GTCTCAAAAA TAAAATAAAA TAAATAAATA AATTAGCCAG ATACTGTGTG 
6250 6260 6270 6280 6290 6300 

CACGCCTGCA GTCCCAGCTA TTCTGGAAGC TGAGGTGGGA AGATGGTTAA GCCTGAGAGG 
6310 6320 6330 6340 6350 6360 

ACAAAGCTGC AGTGAGTCAT GTTTGCATCA CTGCACTCCA GCCTGGGTGA CAGAGCAAGA 
6370 6380 6390 6400 6410 6420

CCCTGTCTAA AAAACAAAAA CAGGCCGGGT GTGGTGGCTC ATGCCTGCCA TCCCAGTGCT 

6430 6440 6450 6460 

TTGGGAGGCA GAGGTTGGCA TAATCCCAGC GCTCTGGGAA TTCC 
GGCCGGGGCC GGGCGGGCGG GTCGCGGGGG CAATGCGGGC GCAGGGCCGG GGGCGCCTTC 60
CCCGGCGGCT GCTGCTGCTG CTGGCGCTCT GGGTGCAGGC GGCGCGGCCC ATGGGCTATT 120 
TCGAGCTGCA GCTGAGCGCG CTGCGGAACG TGAACGGGGA GCTGCTGAGC GGCGCCTGCT 180 
GTGACGGCGA CGGCCGGACA ACGCGCGCGG GGGGCTGCGG CCACGACGAG TGCGACACCG 240 
CTCCTTTACC CTCATCGTGG AGGCCTGGGA CTGGGACAAC GATACCACCC CGAATGAGGA 300 
ACG TGC ATC
Thr Cys Ile

GGC TAC TCG 
Gly Tyr Ser 
170 

AAC CCG TGT 
Asn Pro Cys 

185 
GAA TGC CAC 
Glu Cys His 
200 

ATC GAT GAG 
Ile Asp Glu

GAC CAG GTG
Asp Gln Val



AAC GCC 
Asn Ala 
155 

GGC AGG 
Gly Arg 

GCC AAC 
Ala Asn 

TGC CCA 
Cys Pro 



GCC ACC TGC 
Ala Thr Cys 
250 

AAC GCT TTT 
Asn Ala Phe 

265 
ATC CCG GGC 
Ile Pro Gly
280

CGC GGG CAG
Arg Gly Gln

TAC CAG TGT
Tyr Gln Cys

GAA CGA GAC 
Glu Arg Asp 
330 

GAG GAC CTG 
Glu Asp Leu 
345 



TGT GCT 
Cys Ala 
220 
GAC GGC 
Asp Gly 
235 

CAG CTG 
Gln Leu



GAG CCT 
Glu Pro 

AAC TGT 
Asn Cys 

GGG GGC 
Gly Gly 
190 
TCG GGC 
Ser Gly 
205 

TCG AAC 
Ser Asn 



GAC CAG TAC 
Asp Gln Tyr

160 
GAG AAG GCT 
Glu Lys Ala 
175 

TCT TGC CAT 
Ser Cys His 

TGG AGC GGG 
Trp Ser Gly 



TTT GAG 
Phe Glu 

GAC GCC 
Asp Ala 



TCT TGC 
Ser Cys 

TGG AAG 
Trp Lys 

TGT CAG 
Cys Gln
300 
GTG TGC 
Val Cys 
315 

AAG TGT 
Lys Cys 

GCC GAC 
Ala Asp 



AAA AAC 
Lys Asn 
270 
GGC ATC 
Gly Ile
285 

CAT GGG 
His Gly 

CCA CGG 
Pro Arg 

GCC AGC 
Ala Ser 

GGC TTC 
Gly Phe 
350 



CCG TGT GCG 
Pro Cys Ala 
225 

TGC ATC TGC 
Cys Ile Cys

240 
AAT GAG TGT 
Asn Glu Cys 
255 

CTG ATT GGC 
Leu Ile Gly



CGC TGC 
Arg Cys 

GAG CAC 
Glu His 

GAG GTG 
Glu Val 
195 
CCC ACC 
Pro Thr 
210 

GCC GGT 
Ala Gly 



ACC TGC 
Thr Cys 
165 
GCC TGC 
Ala Cys 
180 

CCG TCC 
Pro Ser 

TGT GCC 
Cys Ala 

GGC ACC 
Gly Thr 



AAC TGC CAT 
Asn Cys His 



GGC ACC TGC 
Gly Thr Cys 
305 

GGC TTC GGA 
Gly Phe Gly 

320 
AGC CCC TGC 
Ser Pro Cys 
335 

CAC TGC CAC 
His Cys His 



CCC GAG 
Pro Glu 

GAA GGG 
Glu Gly 

GGC TAT 
Gly Tyr 
275 
ATC AAC 
Ile Asn
290 

AAG GAC 
Lys Asp 



CAG TGG 
Gln Trp
245 
AAG CCA 
Lys Pro 
260 

TAC TGT 
Tyr Cys 

GTC AAC 
Val Asn 

CTG GTG 
Leu Val 



CCT GAC 832 
Pro Asp 

ACC TCC 880 
Thr Ser 

GGC TTC 928 
Gly Phe 

CTT GAC 976 
Leu Asp 
215 

TGT GTG 1024 
Cys Val 
230 

GTG GGG 1072 
Val Gly 



GGC CGG 
Gly Arg 

CAC AGC 
His Ser 

TGC CCC 
Cys Pro 
355 



CAT TGC 
His Cys 
325 
GGC GGC 
Gly Gly 
340 

CAG GGC 
Gln Gly



TGC CTT 1120 
Cys Leu 

GAT TGC 1168 
Asp Cys 

GAC TGT 1216 
Asp Cys 
295 

AAC GGG 1264 
Asn Gly 
310 

GAG CTG 1312 
Glu Leu 



CTC TGC 1360 
Leu Cys 

TTC TCC 1408 
Phe Ser 
GGG CCT
Gly Pro 
360 

CGG AAC 
Arg Asn 

TGC CCT 
Cys Pro 

TGC CCT 
Cys Pro 

GGG CCT 
Gly Pro 
425 
GGA CGC 
Gly Arg 
440 

AGT GGC 
Ser Gly 

GGC CAG 
Gly Gln

TTC CGC 
Phe Arg 

AAT CCC 
Asn Pro 
505 
TAC GAC 
Tyr Asp 
520 

GGC AAG 
Gly Lys 

AGC AAC 
Ser Asn 

TGC CCC 
Cys Pro 



CTC TGT GAG 
Leu Cys Glu 

GGC GCT CGC 
Gly Ala Arg 
380 

GAT GAC TTT 
Asp Asp Phe 

395 
GGC GGG GCC 
Gly Gly Ala 
410 

GGG ATG CCT 
Gly Met Pro 

TGC GTC AGC 
Cys Val Ser 

TTT ACT GGC 
Phe Thr Gly 
460 

CCC TGC CGC 
Pro Cys Arg 

475 
TGC TTC TGC 
Cys Phe Cys 
490 

AAC GAC TGC 
Asn Asp Cys 

CTG GTC AAT 
Leu Val Asn 

ACC TGC CAC 
Thr Cys His 
540 

GGT GGC ACC 
Gly Gly Thr 

555 
CCC GGC TGG 
Pro Gly Trp 
570 



GTG GAT 
Val Asp 
365 

TGC TAT 

Cys Tyr 

GGT GGC 
Gly Gly 

TGC AGA 
Cys Arg 

GGC ACA 
Gly Thr 
430 
CAG CCA 
Gln Pro
445 

ACC TAC 
Thr Tyr 

AAT GGG 
Asn Gly 

CCC AGC 
Pro Ser 

CTT CCC 
Leu Pro 
510 
GAC TTC 
Asp Phe 
525 

TCA CGC 
Ser Arg 

TGC TAC 
Cys Tyr

AAG GGC 
Lys Gly 



GTC GAC 
Val Asp 

AAC CTG 
Asn Leu 

AAG AAC 
Lys Asn 
400 
GTG ATC 
Val Ile
415 

GCA GCC 
Ala Ala 

GGG GGC 
Gly Gly 

TGC CAT 
Cys His 

GGC ACA 
Gly Thr 
480 
GGT TGG 
Gly Trp
495

GAT CCC
Asp Pro

TAC TGT
Tyr Cys

GAG TTC 
Glu Phe 

GAC AGC 
Asp Ser 
560 
AGC ACC 
Ser Thr
575 



CTT TGT GAG 
Leu Cys Glu 

370 
GAG GGT GAC 
Glu Gly Asp 
385 

TGC TCC GTG 
Cys Ser Val 

GAT GGC TGC 
Asp Gly Cys 

TCC GGC GTG 
Ser Gly Val 
435 

AAC TTT TCC 
Asn Phe Ser 

450 
GAG AAC ATT 
Glu Asn Ile
465

TGC ATC GAT
Cys Ile Asp

GAG GGC GAG 
Glu Gly Glu 

TGC CAC AGC 
Cys His Ser 
515 

GCG TGC GAC 
Ala Cys Asp 

530 
CAG TGC GAT 
Gln Cys Asp
545 

GGC GAC ACC 
Gly Asp Thr 

TGC GCC GTC 
Cys Ala Val 



CCA AGC 
Pro Ser 

TAT TAC 
Tyr Tyr 

CCC CGC 
Pro Arg 
405 
GGG TCA 
Gly Ser 
420 

TGT GGC 
Cys Gly 

TGC ATC 
Cys Ile

GAC GAC 
Asp Asp 

GAG GTG 
Glu Val 
485 
CTC TGC 
Leu Cys 
500 

CGC GGC 
Arg Gly 

GAC GGC 
Asp Gly 

GCC TAC 
Ala Tyr 

TTC CGC 
Phe Arg 
565 
GCC AAG 
Ala Lys 
580 



CCC TGC 
Pro Cys 
375 
TGC GCC 
Cys Ala 
390 

GAG CCG 
Glu Pro 

GAC GCG 
Asp Ala 

CCC CAT 
Pro His 

TGT GAC 
Cys Asp 
455 
TGC CTG 
Cys Leu 
470 

GAC GCC 
Asp Ala 

GAC ACC 
Asp Thr 

CGC TGC 
Arg Cys 

TGG AAG 
Trp Lys 
535 
ACC TGC 
Thr Cys 
550 

TGC GCC 
Cys Ala 

AAC AGC 
Asn Ser 



1456 
1504 
1552 
1600 
1648 
1696 
1744 
1792 
1840 
1888 
1936 
1984 
2032 
2080 
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,rr tcc fir CCC AAC CCC TGT GTG AAT GGT GGC ACC TGC GTG GGC AGC 2128 
S Cys S Pro Mn.Pro Cys Val Asn Gly Gly Thr Cys V.1 Gly Ser 

rrr rfc TCC TTC TCC TGC ATC TGC CGG GAC GGC TGG GAG GGT CGT ACT 2176 
g SS Se C r Jhe 2 Cys He Cys Arg Asp Gly Trp Glu Gly Arg Thr 

?? 0 C ACT CAC AAT ACC MC GAC TGC AAC CCT CTG CCT TGC TAG AAT GGT 2224 
Cys fhr ms J£ Thr Asn Asp Cys Asn Pro Leu Pro Cys Tyr Asn Gly 

T r tpt ptt rfc CCC GTC AAC TGG TTC CGC TGC GAG TGT GCA CCT 2272 
S ?S SI S! Z S 5E «„ Trp P,e Ar 9 C,s G, U Cys AT Pro 

TTC GCG II OCT GAC TGC « ATC MC ATC GAC GAG TGC CAG TCC ,3,0 
650 °£ T „ T r* at CAft ATr AAC GGG TAT 2368 



Sf, S ™ SS Pro S £ 2? Yu Asn He Asp GTu Cys S,n Ser 

™ « 1 S SE S S 5 SI SS 5 K ffi 55 S S 
Arg S S 5 S S £ £ K S £ S. 5 K SI S 2416 
I S K SS 2! § SS 5 £ £ S S S S Si 85 2464 

- 5 S S £ s a s 5 s s 5 S s 2 2 2560 

52SSS2SS15SKS2S2SSS- 
AAC CGT GAC CAC GTG CCC CAG GGC ACC ACG GTG GGC GCC ATT TGC TCC 2800
Asn Arg Asp His Val Pro Gln Gly Thr Thr Val Gly Ala Ile Cys Ser

810 815 820 

GGG ATC CGC TCC CTG CCA GCC ACA AGG GCT GTG GCA CGG GAC CGC CTG 2848 
Gly Ile Arg Ser Leu Pro Ala Thr Arg Ala Val Ala Arg Asp Arg Leu

825 830 835 

CTG GTG TTG CTT TGC GAC CGG GCG TCC TCG GGG GCC AGT GCT GTG GAG 2896 
Leu Val Leu Leu Cys Asp Arg Ala Ser Ser Gly Ala Ser Ala Val Glu 
840 845 850 855 

GTG GCC GTG TCC TTC AGC CCT GCC AGG GAC CTG CCT GAC AGC AGC CTG 2944 
Val Ala Val Ser Phe Ser Pro Ala Arg Asp Leu Pro Asp Ser Ser Leu 

860 865 870 

ATC CAG GGC GCG GCC CAC GCC ATC GTG GCC GCC ATC ACC CAG CGG GGG 2992 
Ile Gln Gly Ala Ala His Ala Ile Val Ala Ala Ile Thr Gln Arg Gly

875 880 885 

AAC AGC TCA CTG CTC CTG GCT GTC ACC GAG GTC AAG GTG GAG ACG GTT 3040 
Asn Ser Ser Leu Leu Leu Ala Val Thr Glu Val Lys Val Glu Thr Val 

890 895 900 

GTT ACG GGC GGC TCT TCC ACA GGT CTG CTG GTG CCT GTG CTG TGT GGT 3088 
Val Thr Gly Gly Ser Ser Thr Gly Leu Leu Val Pro Val Leu Cys Gly 

905 910 915 

GCC TTC AGC GTG CTG TGG CTG GCG TGC GTG GTC CTG TGC GTG TGG TGG 3136 
Ala Phe Ser Val Leu Trp Leu Ala Cys Val Val Leu Cys Val Trp Trp 
920 925 930 935 

ACA CGC AAG CGC AGG AAA GAG CGG GAG AGG AGC CGG CTG CCG CGG GAG 3184 
Thr Arg Lys Arg Arg Lys Glu Arg Glu Arg Ser Arg Leu Pro Arg Glu 

940 945 950 

GAG AGC GCC AAC AAC CAG TGG GCC CCG CTC AAC CCC ATC CGC AAC CCC 3232
Glu Ser Ala Asn Asn Gln Trp Ala Pro Leu Asn Pro Ile Arg Asn Pro

955 960 965 

ATT GAG CGG CCG GGG GGG CAC AAG GAC GTG CTC TAC CAG TGC AAG AAC 3280 
Ile Glu Arg Pro Gly Gly His Lys Asp Val Leu Tyr Gln Cys Lys Asn

970 975 980 

TTC ACT CCA CCG CCG CGC AGG CGC TGC CCG GGC CGG CCG GCC ACG CGG 3328
Phe Thr Pro Pro Pro Arg Arg Arg Cys Pro Gly Arg Pro Ala Thr Arg 

985 990 995 

CCG TCA GGG AGG ATG AGG AGG ACG AGG ATC TTG GCC GCG GTG AGG AGG 3376 
Pro Ser Gly Arg Met Arg Arg Thr Arg Ile Leu Ala Ala Val Arg Arg
1000 1005 1010 1015 

ACT CCC TGG AGG CGG AGA AGT TCC TCT CAC ACA AAT TCA CCA AAG ATC 3424 
Thr Pro Trp Arg Arg Arg Ser Ser Ser His Thr Asn Ser Pro Lys Ile
1020 1025 1030 
CTG GCC GCT CGC CGG GGA GGC CGG CCC ACT GGG CCT CAG GCC CCA AAG
Leu Ala Ala Arg Arg Gly Gly Arg Pro Thr Gly Pro Gln Ala Pro Lys

1035 1040 1045

TGG ACA ACC GCG CGG TCA GGA GCA TCA ATG AGG CCC GCT ACG TCG GCA
Trp Thr Thr Ala Arg Ser Gly Ala Ser Met Arg Pro Ala Thr Ser Ala

1050 1055 1060

AGG GAA GTA GGG CGG CTG CAG CTG GGC CGG GAC CCA GGG CCC TCG GTG
Arg Glu Val Gly Arg Leu Gln Leu Gly Arg Asp Pro Gly Pro Ser Val

1065 1070 1075

CGA GCC ATG CCG TCT GCC GGA CCC GGA GGC CGA GGC CAT GTG CAT AGT 
Gly Sa Me? Pro Ser Ala^Gly Pro Gly Gly Argtfy His Val ms Ser^ 

T?C 0 nT ATT TTG TGT AAA^AAA ACC ACC AAA AAC AAA AAC CAA ATG TTT 
Se JTe He leu Cys Lys Lys Thr Thr Lys Asn Lys Asn Gin Met Phe 

1100 iiiU 
ATT TTC TAC GTT TCT TTA ACC TTG TAT AAA TTA TTC AGT AAC TGT CAG 
ne Phe Tyr Val Ser Leu Thr Leu Tyr Lys Leu Phe Ser Asn Cys Gin 

1115 iJ -" 
GCT GAA AAC AAT GGA GTA TTC TCG GAT AGT TGC TAT TTT TGT AAA GTA 
Sa tTu itS Jsn Gly Val Phe Ser^Asp Ser Cys Tyr PheCys Lys Val 

rrr C1C fcfcGC ACT CGC TGT ATGAAA GGA GAG AGC AAA GGG TGT CTG 
Xa Si Arg lUr Arg Cys Met Lys Gly Glu Ser Lys Gly Cys Leu 

■m/ic; 1150 Hob 

rCT en r.AC CAA ATC GTC GCG TTT GTT ACC AGA GGT TGT GCA CTG TTT 
Z S SS G^ val Al. Phe V.1 Thr *,By Cys A,a Leu « 

S Si S 2 S 2 2 S K 2 S S £5 



3472 
3520 
3568 
3616 
3664 
3712 
3760 
3808 
3856 
3904
3952
4000 
4048 
GGT GGG ACC CTG GTT ATT GAT GTG GCC CTG GCT GCC GGC ACG GCC CGT 4096

Gly Gly Thr Leu Val Ile Asp Val Ala Leu Ala Ala Gly Thr Ala Arg

1240 1245 1250 1255 

GGC TGT TG ACGCACCTGT GGTTGTTAGT GGGGCCTGAG GTCATCGGCG TGGCCCAAGG 4154 

Gly Cys 

CCGGCAGGTC AACCTCGCGC TTGCTGGCCA GTCCACCCTG CCTGCCGTCT GTGCTTCCTC 4214 

CTGCCCAGAA CGCCCGCTCC AGCGATCTCT CCACTGTGCT TTCAGAAGTG CCCTTCCTGC 4274 

TGCGCAGTTC TCCCATCCTG GGACGGCGGC AGTATTGAAG CTCGTGACAA GTGCCTTCAC 4334 

ACAGACCCCT CGCAACTGTC CACGCGTGCC GTGGCACCAG GCGCTGCCCA CCTGCCGGCC 4394 

CCGGCCGCCC CTCCTCGTGA AAGTGCATTT TTGTAAATGT GTACATATTA AAGGAAGCAC 4454 

TCTGTATAAA AAAAAAAAAC CGGAATTCC 4483 
CAGGTGGCGTCAGCATCGGGACAGTTCGAGCTGGAGATCTTATCCGTGCAGAATGTGAACGGCGTGCT

GCAGAACGGGAACTGCTGCGACGGCACTCGAAACCCCGGAGATAAAAAGTGCACCAGAGATGAGTGTG 

ACACCTACTTTAAAGTTTGCCTGAAGGAGTACCAGTCGCGGGTCACTGCTGGCGGCCCTTGCAGCTTC 

GGATCCAAATCCACCCCTGTCATCGGCGGGAATACCTTCAATTTAAAGTACAGCCGGAATAATGAAAA 

GAACCGGATTGTTATCCCTTTCACGTTCGCCTGGCCGAGATCCTACACGTTGCTTGTTGAGGCATGGG 

ATTACAATGATAACTCTACTAATCCCGATCGCATAATTGAGAAGGCATCCCACTCTGGCATGATCAAT 

CCAAGCCGTCAGTGGCAGACGTTGAAACATAACACAGGAGCTGCCCACTTTGAGTATCAAATCCGTGT 

GACTTGCGCAGAACATTACTATGGCTTTGGATGCAACAAGTTTTGTCGACCGAGAGATGACTTCTTCA 

CTCACCATACCTGTGACCAGAATGGCAACAAAACCTGCTTGGAAGGCTGGACGGGACCAGAATGCAAC 

AAAGCTATTTGTCGTCAGGGATGTAGCCCCAAGCATGGTTCTTGCACAGTTCCAGGAGAGTGCAGGTG 

TCAGTATGGATGGCAAGGCCAGTACTGTGATAAGTGCATTCCACACCCGGGATGTGTCCATGGCACTT 

GCATTGAACCATGGCAGTGCCTCTGTGAAACCAACTGGGGTGGTCAGCTCTGTGACAAAGACCTGAAC 

TACTGTGGAACCCACCCACCCTGTTTGAATGGTGGTACCTGCAGCAACACTGGCCCCGATAAATACCA 

GTGTTCCTGCCCTGAGGGTTACTCAGGACAGAACTGTGAAATAGCGGAGCATGCGTGCCTCTCTGATC

CGTGCCACAACGGAGGAAGCTGCCTAGAAACGTCTACAGGATTTGAATGTGTGTGTGCACCTGGCTGG 

GCTGGACCAACTTGCACTGATAATATTGATGATTGTTCTCCAAATCCCTGTGGTCATGGAGGAACTTG 

CCAAGATCTAGTTGATGGATTTAAGTGTATTTGCCCACCTCAGTGGACTGGCAAAACATGCCAGCTAG 

ATGCGAATGAATGTGAGGGCAAACCCTGTGTCAATGCCAACTCCTGCAGGAACTTGATTGGCAGCTAC 

TATTGTGACTGCATTACTGGCTGGTCTGGCCACAACTGTGATATAAATATTAATGATTGTCGTGGACA 

ATGTCAGAATGGAGGATCCTGTCGGGACTTGGTTAATGGTTATCGGTGCATCTGTTCACCTGGCTATG 

CAGGAGATCACTGTGAGAAAGACATCAATGAATGTGCAAGTAACCCTTGCATGAATGGGGGTCACTGC 

CAGGATGAAATCAATGGATTCCAATGTCTGTGTCCTGCTGGTTTCTCAGGAAACCTCTGTCAGCTGGA 

TATAGACTACTGTGAGCCAAACCCTTGCCAGAACGGTGCCCAGTGCTTCAATCTTGCTATGGACTATT 

TCTGTAACTGCCCTGAAGATTACGAAGGCAAGAACTGCTCCCACCTGAAAGATCACTGCCGCACAACT 

CCTTGTGAAGTAATCGACAGCTGTACAGTGGCAGTGGCTTCTAACAGCACACCAGAAGGAGTTCGTTA 

CATTTCTTCAAATGTCTGTGGTCCTCATGGAAAATGCAAGAGCCAAGCAGGTGGAAAATTCACCTGTG 

AATGCAACAAAGGATTCACTGGCACCTACTGTCATGAGAATATCAATGACTGTGAGAGCAACCCCTGT 

AAAAATGGTGGCACTTGTATTGACGGTGTAAACTCCTACAAATGTATTTGTAGTGATGGATGGGAAGG 

AACATATTGTGAAACAAATATTAATGACTGCAGTAAAAACCCCTGCCACAATGGAGGAACTTGCCGAG 

ACTTGGTCAATGACTTCTTCTGTGAATGTAAAAATGGGTGGAAAGGAAAAACTTGCCACTCTCGTGAC 

AGCCAGTGTGATGAGGCAACATGCAATAATGGAGGAACATGTTATGATGAGGGGGACACTTTCAAGTG 

CATGTGTCCTGCAGGATGGGAAGGAGCCACTTGTAATATAGCAAGGAACAGCAGCTGCCTGCCAAACC 

CCTGTCACAATGGTGGTACCTGTGTAGTTAGTGGGGATTCTTTCACTTGTGTCTGCAAGGAGGGCTGG 

GAAGGACCGACATGTACTCAGAACACAAATGACTGCAGTCCTCATCCTTGTTACAACAGTGGTACTTG 

TGTGGATGGAGACAACTGGTACCGCTGTGAGTGCGCTCCCGGCTTCGCAGGTCCCGACTGTAGGATCA 

ACATCAATGAATGTCAGTCTTCACCCTGTGCCTTTGGGGCTACTTGTGTGGATGAAATTAATGGGTAC 

CGTTGCATTTGTCCACCGGGTCGCAGTGGTCCAGGATGCCAGGAAGTTACAGGGAGGCCTTGCTTTAC 

CAGTATTCGAGTAATGCCAGACGGTGCTAAGTGGGATGATGACTGTAATACTTGTCAGTGTTTGAATG 

GAAAAGTCACCTGTTCTAAGGTTTGGTGTGGTCCTCGACCTTGTATAATACATGCCAAAGGTCATAAT 

GAATGCCCAGCTGGACACGCTTGTGTTCCTGTTAAAGAAGACCATTGTTTCACTCATCCTTGTGCTGC 
AGTGGGTGAATGCTGGCCTTCTAATCAGCAGCCTGTGAAGACCAAATGCAATTCTGATTCTTATTACC
AAGATAATTGTGCCAACATCACCTTCACCTTTAATAAGGAAATGATGGCACCAGGCCTTACCACGGAG 
CACATTTGCAGTGAATTGAGGAATCTGAATATCCTGAAGAATGTTTCTGCTGAATATTCCATCTATAT 
TACCTGTGAGCCTTCACACTTGGCAAATAATGAAATACATGTTGCTATTTCTGCTGAAGATATAGGAG 
AAGATGAAAACCCAATCAAGGAAATCACAGATAAGATTATTGACCTTGTCAGTAAGCGTGATGGAAAC 
AACACACTAATTGCTGCAGTCGCAGAAGTCAGAGTACAAAGGCGACCAGTTAAGAACAAAACAGATTT 
CTTGGTGCCATTACTGAGCTCAGTCTTAACAGTAGCCTGGATCTGCTGTCTGGTAACTGTTTTCTATT 
GGTGCATTCAAAAGCGCAGAAAGCAGAGCAGCCATACTCACACAGCATCTGATGACAACACCACCAAC 
AACGTAAGGGAGCAGCTGAATCAGATTAAAAACCCCATAGAGAAACACGGAGCAAATACTGTTCCAAT 
TAAAGACTATGAAAACAAAAACTCTAAAATCGCCAAAATAAGGACGCACAATTCAGAAGTGGAGGAAG 
ATGACATGGACAAACACCAGCAAAAGGCCCGGTTTGCCAAGCAGCCAGCGTACACTTTGGTAGACAGA 
GATGAAAAGCCACCCAACAGCACACCCACAAAACACCCAAACTGGACAAATAAACAGGACAACAGAGA 
CTTGGAAAGTGCACAAAGTTTAAATAGAATGGAGTACATTGTATAG 
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QVASASGQFE LEILSVQNVN GVLQNGNCCD GTRNPGDKKC TRDECDTYFK 50 

VCLKEYQSRV TAGGPCSFGS KSTPVIGGNT FNLKYSRNNE KNRIVIPFSF 100 

AWPRSYTLLV EAWDYNDNST NPDRIIEKAS HSGMINPSRQ WQTLKHNTGA 150 

AHFEYQIRVT CAEHYYGFGC NKFCRPRDDF FTEHTCDQNG NKTCLEGWTG 200 

********************dsl DOMA I n**************** 

PECNKAICRQ GCSPKHGSCT VPGECRCQYG WQGQYCDKCI PHPGCVHGTC 250 

*** < EGF 1 - >< 

1EPWQCLCET NWGGQLCDKD LNYCGTHPPC LNGGTCSNTG PDKYQCSCPE 300 

EGF 2--- >< EGF 3 *"" 



GYSGQNCEIA EHACLSDPCH NGGSCLETST GFECVCAPGW AGPTCTDNID 350 

><- EGF 4 - 

DCSPNPCGHG GTCODLVDGF KCICPPQWTG KTCQLDANEC EGKPCVNANS 400 

>< EFG 5 x 

CRNLIGSYYC DCITGWSGHN CDININDCRG QCQNGGSCRD LVNGYRCICS 450 



•EFG 6—- ><- EFG7- 



PGYAGDHCEK DINECASNPC MNGGHCODEI NGFQCLCPAG FSGNLCQLDI 500 

x EFG 8 

DYCEPNPCQN GAQCFNLAMD YFCNCPEDYE GKNCSHLKDH CRTTPCEVID 550 



-><■ 



■EFG 9 x. 



SCTVAVASNS TPEGVRYISS NVCGPHGKCK SQAGGKFTCE CNKGFTGTYC 600 

EFG 10 

HENINDCESN PCKNGGTCID GVNSYKC1CS DGWEGTYCET N1NDCSKNPC 650 

>< EFG 11 x 

HNGGTCRDLV NDFFCECKNG WKGKTCHSRD SOCDEATCNN GGTCYDEGDT 700 

EFG 12 x 

FKCMCPAGWE GATCNIARNS SCLPNPCHNG GTCVVSGDSF TCVCKEGWEG 750 

fgp 13 >< EGF 14 

PTCTQNTNDC SPHPCYNSGT CVDGDNWYRC ECAPGFAGPD CRININECOS 800 

>< EGF 15 x-- 

SPCAFGATCV DEINGYRCIC PPGR5GPGCQ EVTGRPCFTS I RVMPDGAKW 850 

EGF 16 — > „ nnn 

DDDCNTCQCL NGKVTCSKVW CGPRPCIIHA KGHNECPAGH ACVPVKEDHC 900 

CYSTEINE-RICH REGION 
FTHPCAAVGE CWPSNQQPVK TKCNSDSYYQ DNCANITFTF NKEMMAPGLT 950 



-> 



TEHICSELRN LNILKNVSAE YSIYITCEPS HLANNEIHVA ISAEDIGEDE 1000 
NPIKEITDKI IDLVSKRDGN NTLIAAVAEV RVQRRPVKNK TDFLVPLLSS 1050 
VLTVAWICCL VTVFYWCIQK RRKQSSHTHT ASDDNTTNNV REQLNQIKNP 1100 
IEKHGANTVP IKDYENKNSK IAKIRTHNSE VEEDDMDKHQ QKARFAKQPA 1150 
YTLVDRDEKP PNSTPTKHPN WTNKQDNRDL ESAQSLNRME YIV 1193 
1. [ ] Claims Nos.: 
because they relate to subject matter not required to be searched by this Authority, namely: 

2. [ ] Claims Nos.: 
because they relate to parts of the international application that do not comply with the prescribed requirements to such 
an extent that no meaningful international search can be carried out, specifically: 

3. [ ] Claims Nos.: 
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a). 



Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet) 



This International Searching Authority found multiple inventions in this international application, as follows: 
Please See Extra Sheet. 



1. [X] As all required additional search fees were timely paid by the applicant, this international search report covers all searchable 
claims. 

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment 
of any additional fee. 

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search report covers 
only those claims for which fees were paid, specifically claims Nos.: 

4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this international search report is 
restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 
BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING 
This ISA found multiple inventions as follows: 

This application contains the following inventions or groups of inventions which are not so linked as to form a single 
inventive concept under PCT Rule 13.1. In order for all inventions to be examined, the appropriate additional 
examination fees must be paid. 

Group I, claim(s) 1-29, 34-68, 78, and 79, drawn to DNA and amino acid sequences encoding Serrate protein. 
Group II, claim(s) 30-33, drawn to an antibody against Serrate protein. 

Group III, claim(s) 69-73 and 77, drawn to a method of treating disease with Serrate protein. 

Group IV, claim(s) 74 and 75, drawn to a method of treating disease with DNA encoding Serrate protein - gene 
therapy. 

Group V, claim(s) 76, drawn to a method of treating a disease with the antibody against Serrate protein. 
Group VI, claim(s) 80, drawn to a method of inhibiting expression of Serrate protein using antisense DNA. 
Group VII, claim(s) 81, drawn to a method for diagnosing a disease via Notch:Serrate protein binding assay. 
Group VIII, claim(s) 82, drawn to a method for diagnosing a disease via measuring Serrate protein levels. 



The inventions listed as Groups I through VIII do not relate to a single inventive concept under PCT Rule 13.1 because, 
under PCT Rule 13.2, they lack the same or corresponding special technical features, for the following reasons: 

Group I and Group II are structurally different compounds/compositions having unique properties not shared by the 
other. A reference anticipating or rendering obvious the compound/composition of Group I would not necessarily 
anticipate or make obvious the compound/composition of Group II. Groups III-VIII are properly grouped separately 
from the main invention of Group I, pursuant to 37 CFR 1.475(d). Therefore, the groupings lack the same or 
corresponding special technical feature. 
BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING 
This ISA found multiple inventions as follows: 

Group I, claims 1-33, 35, 39, 41-43, drawn to a method for expansion of precursor cells. 
Group II, claim 34, drawn to a method of therapy. 
Group III, claims 36-38, drawn to a method of inhibiting cell growth. 

The inventions listed as Groups I, II, and III do not relate to a single inventive concept under PCT Rule 13.1 because 